Abstract. We consider the solution of nonlinear programs with nonlinear semidefiniteness
constraints. The need for an efficient exploitation of the cone of positive semidefinite matrices 
makes the solution of such nonlinear semidefinite programs more complicated than the solution 
of standard nonlinear programs. In particular, a suitable symmetrization procedure needs to be 
chosen for the linearization of the complementarity condition. The choice of the symmetrization 
procedure can be shifted in a very natural way to certain linear semidefinite subproblems, 
and can thus be reduced to a well-studied problem. The resulting sequential semidefinite 
programming (SSP) method is a generalization of the well-known SQP method for standard 
nonlinear programs. We present a sensitivity result for nonlinear semidefinite programs, and 
then based on this result, we give a self-contained proof of local quadratic convergence of the 
SSP method. We also describe a class of nonlinear semidefinite programs that arise in passive 
reduced-order modeling, and we report results of some numerical experiments with the SSP 
method applied to problems in that class. 

Key words. semidefinite programming, nonlinear programming, sequential quadratic programming

1 Introduction

In recent years, interior-point methods for solving linear semidefinite programs (SDPs) have
received a lot of attention, and as a result, these methods are now very well developed; see,
e.g., [?], the papers in [?], and the references given there. At each iteration of an
interior-point method, the complementarity condition is relaxed, symmetrized, and linearized. Various
symmetrization operators are known. The choice of the symmetrization operator and of the
relaxation parameter determines the step length at each iteration, and thus the efficiency of the
overall method.

In this paper, we are concerned with the solution of nonlinear semidefinite programs (NLSDPs).
Interior-point methods for linear SDPs can be extended to NLSDPs. However, some
additional difficulties arise. First, the step length now also depends on the quality of the
linearization of the nonlinear functions. Second, the choice of the symmetrization procedure is
considerably more complicated than in the linear case since the system matrix is no longer
positive semidefinite. To address these difficulties, in this paper, we consider an approach that
separates the linearization and the symmetrization in a natural way, namely a generalization of
the sequential quadratic programming (SQP) method for standard nonlinear programs. Such a
generalization has already been mentioned by Robinson [?] within the more general framework
of nonlinear programs over convex cones. This framework includes NLSDPs as a special case.
While Robinson did not discuss implementation issues of such a generalized SQP approach,
the recent progress in the solution of linear SDPs makes this approach especially interesting
for the solution of NLSDPs.

We first present a derivation of a generalized SQP method, namely the sequential semidefinite
programming (SSP) method, for solving NLSDPs. In order to analyze the convergence
of the SSP method, we present a sensitivity result for certain local optimal solutions of general,
possibly nonconvex, quadratic semidefinite programs. We then use this result to derive a
self-contained proof of local quadratic convergence of the SSP method under the assumptions
that the optimal solution is locally unique, strictly complementary, and satisfies a second-order
sufficient condition.

One of the first numerical approaches for solving a class of NLSDPs was given in [?].
Other recent approaches for solving NLSDPs include the program package LOQO [?], which is based on a
primal-dual method; see also [?]. Another promising approach for solving large-scale SDPs is
the modified-barrier method proposed in [?]. This modified-barrier approach does not require
the barrier parameter to converge to zero, and may thus overcome some of the problems
related to ill-conditioning in traditional interior-point methods. Further approaches to solving
NLSDPs have been presented in [?]. In [?], the augmented Lagrangian method is
applied to NLSDPs, while the approach proposed in [12] is based on an SQP method generalized
to NLSDPs. The paper [12] also contains a proof of local quadratic convergence. However,
in contrast to this paper, the algorithm in [12] is not derived from a comparison with
interior-point algorithms, and the proof of convergence does not use any differentiability properties of
the optimal solutions. In [?], Correa and Ramirez present a proof of global convergence of a
modification of the method proposed in [12]. The modification employs certain merit functions
to control the step lengths of the SQP algorithm.

The remainder of this paper is organized as follows. In Section 2, we introduce some
notation. In Section 3, we describe a class of nonlinear semidefinite programs that arise in
passive reduced-order modeling. In Section 4, we recall known results for linear SDPs in a
form that can be easily transferred to NLSDPs. In Section 5, we discuss primal-dual systems
for NLSDPs, and in Section 6, the SSP method is introduced as a generalized SQP method. In
Section 7, we present sensitivity results, first for a certain class of quadratic SDPs, and then
for general NLSDPs. Based on these sensitivity results, in Section 8, we give a self-contained
proof of local quadratic convergence of the SSP method. In Section 9, we present results of
some numerical experiments. Finally, in Section 10, we make some concluding remarks.

2 Notation


Throughout this article, all vectors and matrices are assumed to have real entries. As usual,
$Y^T = [y_{ji}]$ denotes the transpose of the matrix $Y = [y_{ij}]$. The vector norm $\|x\| := \sqrt{x^T x}$ is
always the Euclidean norm and $\|Y\| := \max_{\|x\|=1} \|Yx\|$ is the corresponding matrix norm. For
vectors $x \in \mathbb{R}^n$, $x \ge 0$ means that all entries of $x$ are nonnegative, and $\mathrm{Diag}(x)$ denotes the
$n \times n$ diagonal matrix whose diagonal entries are the entries of $x$. The $n \times n$ identity
matrix is denoted by $I_n$.

The trace inner product on the space of real $n \times m$ matrices is given by

$$\langle Z, Y \rangle := Z \bullet Y := \operatorname{trace}(Z^T Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} z_{ij}\, y_{ij}$$

for any pair $Y = [y_{ij}]$ and $Z = [z_{ij}]$ of $n \times m$ matrices. The space of real symmetric $m \times m$
matrices is denoted by $\mathcal{S}^m$. The notation $Y \succeq 0$ ($Y \succ 0$) is used to indicate that $Y \in \mathcal{S}^m$ is
symmetric positive semidefinite (positive definite).
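As a small numerical illustration of this notation, the following NumPy sketch checks that the trace form and the entrywise sum of the inner product agree, and tests semidefiniteness through the smallest eigenvalue. The helper names `inner` and `is_psd` and all matrices are hypothetical and chosen only for this example.

```python
import numpy as np

def inner(Z, Y):
    """Trace inner product Z . Y = trace(Z^T Y)."""
    return np.trace(Z.T @ Y)

def is_psd(Y, tol=1e-10):
    """True if the symmetric matrix Y is positive semidefinite up to tol."""
    return bool(np.linalg.eigvalsh(Y).min() >= -tol)

# Two 2 x 3 matrices (example data)
Z = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
Y = np.array([[4.0, 0.0, 1.0], [2.0, 5.0, 1.0]])
entrywise = float(np.sum(Z * Y))          # sum_ij z_ij * y_ij

P = np.array([[2.0, -1.0], [-1.0, 2.0]])  # eigenvalues 1 and 3
Q = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3

print(inner(Z, Y), entrywise, is_psd(P), is_psd(Q))
```

The eigenvalue test is only one of several possible ways to check $Y \succeq 0$; a Cholesky attempt would serve equally well for the definite case.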

Semidefiniteness constraints are expressed by means of matrix-valued functions from $\mathbb{R}^n$
to $\mathcal{S}^m$. We use the symbol $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$ if such a function is linear, and the symbol
$\mathcal{B} : \mathbb{R}^n \to \mathcal{S}^m$ if such a function is nonlinear.

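Note that any linear function $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$ can be expressed in the form

$$\mathcal{A}(x) = \sum_{i=1}^{n} x_i A^{(i)} \quad \text{for all } x \in \mathbb{R}^n, \tag{1}$$

with symmetric matrices $A^{(i)} \in \mathcal{S}^m$, $i = 1, 2, \ldots, n$. Based on the representation (1), we
introduce the norm

$$\|\mathcal{A}\| := \Bigl( \sum_{i=1}^{n} \|A^{(i)}\|^2 \Bigr)^{1/2} \tag{2}$$

of $\mathcal{A}$. The adjoint operator $\mathcal{A}^* : \mathcal{S}^m \to \mathbb{R}^n$ with respect to the trace inner product is defined
by

$$\langle \mathcal{A}(x), Y \rangle = \langle x, \mathcal{A}^*(Y) \rangle = x^T \mathcal{A}^*(Y) \quad \text{for all } x \in \mathbb{R}^n \text{ and } Y \in \mathcal{S}^m.$$

It turns out that

$$\mathcal{A}^*(Y) = \begin{bmatrix} A^{(1)} \bullet Y \\ \vdots \\ A^{(n)} \bullet Y \end{bmatrix} \quad \text{for all } Y \in \mathcal{S}^m. \tag{3}$$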
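The defining identity of the adjoint can be checked numerically. The sketch below represents a linear map by its list of symmetric matrices $A^{(i)}$ and compares both sides of $\langle \mathcal{A}(x), Y \rangle = x^T \mathcal{A}^*(Y)$. The function names `apply_A` and `apply_A_adj` and the data are hypothetical.

```python
import numpy as np

def apply_A(mats, x):
    """A(x) = sum_i x_i * A^(i) for symmetric matrices A^(i)."""
    return sum(xi * Ai for xi, Ai in zip(x, mats))

def apply_A_adj(mats, Y):
    """A*(Y) = (A^(1) . Y, ..., A^(n) . Y)^T with the trace inner product."""
    return np.array([np.sum(Ai * Y) for Ai in mats])

# Example data with n = 3 and m = 2
mats = [np.array([[1.0, 0.0], [0.0, 0.0]]),
        np.array([[0.0, 1.0], [1.0, 0.0]]),
        np.array([[0.0, 0.0], [0.0, 2.0]])]
x = np.array([1.0, -2.0, 0.5])
Y = np.array([[3.0, 1.0], [1.0, 4.0]])

lhs = np.sum(apply_A(mats, x) * Y)   # <A(x), Y>
rhs = x @ apply_A_adj(mats, Y)       # x^T A*(Y)
print(lhs, rhs)
```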


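We always assume that nonlinear functions $\mathcal{B} : \mathbb{R}^n \to \mathcal{S}^m$ are at least $C^2$-differentiable. We
denote by

$$B^{(i)}(x) := \frac{\partial}{\partial x_i} \mathcal{B}(x) \quad \text{and} \quad B^{(i,j)}(x) := \frac{\partial^2}{\partial x_i\, \partial x_j} \mathcal{B}(x), \quad i, j = 1, 2, \ldots, n,$$

the first and second partial derivatives of $\mathcal{B}$, respectively. For each $x \in \mathbb{R}^n$, the derivative $D_x \mathcal{B}$
at $x$ induces a linear function $D_x \mathcal{B}(x) : \mathbb{R}^n \to \mathcal{S}^m$, which is given by

$$D_x \mathcal{B}(x)[\Delta x] := \sum_{i=1}^{n} (\Delta x)_i\, B^{(i)}(x) \in \mathcal{S}^m \quad \text{for all } \Delta x \in \mathbb{R}^n.$$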


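In particular,

$$\mathcal{B}(x + \Delta x) \approx \mathcal{B}(x) + D_x \mathcal{B}(x)[\Delta x], \quad \Delta x \in \mathbb{R}^n,$$

is the linearization of $\mathcal{B}$ at the point $x$. For any linear function $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$, we have

$$D_x \mathcal{A}(x)[\Delta x] = \mathcal{A}(\Delta x) \quad \text{for all } x, \Delta x \in \mathbb{R}^n. \tag{4}$$

We always use the expression on the right-hand side of (4) to describe derivatives of linear
functions.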

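We remark that for any fixed matrix $Y \in \mathcal{S}^m$, the map $x \mapsto \mathcal{B}(x) \bullet Y$ is a scalar-valued
function of $x \in \mathbb{R}^n$. Its gradient at $x$ is given by

$$\nabla_x \bigl( \mathcal{B}(x) \bullet Y \bigr) = \bigl( D_x ( \mathcal{B}(x) \bullet Y ) \bigr)^T = \begin{bmatrix} B^{(1)}(x) \bullet Y \\ \vdots \\ B^{(n)}(x) \bullet Y \end{bmatrix} \in \mathbb{R}^n \tag{5}$$

and its Hessian by

$$\nabla_x^2 \bigl( \mathcal{B}(x) \bullet Y \bigr) = \begin{bmatrix} B^{(1,1)}(x) \bullet Y & \cdots & B^{(1,n)}(x) \bullet Y \\ \vdots & & \vdots \\ B^{(n,1)}(x) \bullet Y & \cdots & B^{(n,n)}(x) \bullet Y \end{bmatrix} \in \mathcal{S}^n.$$

In particular, for any linear function $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$, in view of (1), (3), and (5), we have

$$\nabla_x \bigl( \mathcal{A}(x) \bullet Y \bigr) = \mathcal{A}^*(Y). \tag{6}$$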
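The gradient formula (5) lends itself to a finite-difference check. The following sketch uses a hypothetical map $\mathcal{B} : \mathbb{R}^2 \to \mathcal{S}^2$ with hand-coded partial derivatives and compares the entries $B^{(i)}(x) \bullet Y$ with central differences of $x \mapsto \mathcal{B}(x) \bullet Y$. The map, the point, and the step size `h` are assumptions made for illustration.

```python
import numpy as np

def B(x):
    """A hypothetical nonlinear map B : R^2 -> S^2."""
    return np.array([[x[0] ** 2 - 1.0, x[0] * x[1]],
                     [x[0] * x[1], x[1] ** 2 - 1.0]])

def B_partials(x):
    """First partial derivatives B^(1)(x) and B^(2)(x) of the map above."""
    return [np.array([[2.0 * x[0], x[1]], [x[1], 0.0]]),
            np.array([[0.0, x[0]], [x[0], 2.0 * x[1]]])]

def grad_BY(x, Y):
    """Gradient (5): the i-th entry is B^(i)(x) . Y."""
    return np.array([np.sum(Bi * Y) for Bi in B_partials(x)])

def fd_grad(x, Y, h=1e-6):
    """Central finite-difference gradient of x -> B(x) . Y."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (np.sum(B(x + e) * Y) - np.sum(B(x - e) * Y)) / (2.0 * h)
    return g

x = np.array([0.5, -1.5])
Y = np.array([[2.0, 1.0], [1.0, 3.0]])
print(grad_BY(x, Y), fd_grad(x, Y))
```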


3 An application in passive reduced-order modeling 

We remark that applications of linear SDPs include relaxations of combinatorial optimization
problems and problems related to Lyapunov functions or the positive real lemma in control
theory; we refer the reader to [?] and the references given there. In this section,
we describe an application in passive reduced-order modeling that leads to a class of NLSDPs.

Roughly speaking, a system is called passive if it does not generate energy. For the special
case of time-invariant linear dynamical systems, passivity is equivalent to positive realness of
the frequency-domain transfer function associated with the system. More precisely, consider
transfer functions of the form

$$Z_n(s) = B_2^T (G + sC)^{-1} B_1, \quad s \in \mathbb{C}, \tag{7}$$

where $G, C \in \mathbb{R}^{n \times n}$ and $B_1, B_2 \in \mathbb{R}^{n \times m}$ are given data matrices. The integer $n$ is the
state-space dimension of the time-invariant linear dynamical system, and $m$ is the number of inputs
and outputs of the system. In (7), the matrix pencil $G + sC$ is assumed to be regular, i.e.,
the matrix $G + sC$ is singular for only finitely many values of $s \in \mathbb{C}$. Note that $Z_n$ is an
$m \times m$-matrix-valued rational function of the complex variable $s \in \mathbb{C}$.

In reduced-order modeling, one is given a large-scale time-invariant linear dynamical system
of state-space dimension $N$, and the problem is to construct a “good” approximation of that
system of state-space dimension $n \ll N$; see, e.g., [?] and the references given there. If the
large-scale system is passive, then for certain applications, it is crucial that the reduced-order
model of state-space dimension $n$ preserves the passivity of the original system. Unfortunately,
some of the most efficient reduced-order modeling techniques do not preserve passivity. However,
the reduced-order models are often “almost” passive, and passivity of the models can be
enforced by perturbing the data matrices of the models. Next, we describe how the problem
of constructing such perturbations leads to a class of NLSDPs.

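An $m \times m$-matrix-valued rational function $Z$ is called positive real if the following three
conditions are satisfied:

(i) $Z$ is analytic in $\mathbb{C}_+ := \{\, s \in \mathbb{C} \mid \operatorname{Re}(s) > 0 \,\}$;

(ii) $Z(\bar{s}) = \overline{Z(s)}$ for all $s \in \mathbb{C}$;

(iii) $Z(s) + (Z(\bar{s}))^T \succeq 0$ for all $s \in \mathbb{C}_+$.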

For functions $Z_n$ of the form (7), positive realness (and thus passivity of the system associated
with $Z_n$) can be characterized via linear SDPs; see, e.g., [?] and the references given
there. More precisely, if the linear SDP

$$\begin{aligned} P^T G + G^T P &\succeq 0, \\ P^T C = C^T P &\succeq 0, \\ P^T B_1 &= B_2, \end{aligned} \tag{8}$$

has a solution $P \in \mathbb{R}^{n \times n}$, then the transfer function in (7), $Z_n$, is positive real. Conversely, under
certain additional assumptions (see [?]), positive realness of $Z_n$ implies the solvability of the
linear SDP (8).
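To make the sufficiency direction concrete, the sketch below verifies the three conditions of (8) for a candidate matrix $P$ and then spot-checks condition (iii) of positive realness at a few points of the right half-plane. All data ($G$, $C$, $B_1$, $B_2$, $P$, the sample points) and the function names are hypothetical; sampling finitely many points is of course no proof of positive realness, only a sanity check of the certificate.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """Semidefiniteness test on the symmetric part of M."""
    M = 0.5 * (M + M.T)
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def satisfies_lmi(P, G, C, B1, B2, tol=1e-10):
    """Check the three conditions of the linear SDP (8) for a candidate P."""
    symmetric = np.allclose(P.T @ C, C.T @ P, atol=tol)
    return (is_psd(P.T @ G + G.T @ P, tol) and symmetric
            and is_psd(P.T @ C, tol)
            and np.allclose(P.T @ B1, B2, atol=tol))

def transfer(s, G, C, B1, B2):
    """Z_n(s) = B2^T (G + s C)^{-1} B1 as in (7)."""
    return B2.T @ np.linalg.solve(G + s * C, B1)

# Example data with n = 2, m = 1; here Z_n(s) = s / (s^2 + s + 1)
G = np.array([[1.0, 1.0], [-1.0, 0.0]])
C = np.eye(2)
B1 = np.array([[1.0], [0.0]])
B2 = B1.copy()
P = np.eye(2)

certified = satisfies_lmi(P, G, C, B1, B2)

# For real data, Z(s) + Z(conj(s))^T equals Z(s) + Z(s)^H.
samples = [0.1 + 2.0j, 1.0 + 0.0j, 3.0 - 5.0j]
spot_ok = all(np.linalg.eigvalsh(Z + Z.conj().T).min() >= -1e-12
              for Z in (transfer(s, G, C, B1, B2) for s in samples))
print(certified, spot_ok)
```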

Now assume that $Z_n$ in (7) is the transfer function of a non-passive reduced-order model
of a passive large-scale system. Our goal is to perturb some of the data matrices in (7) so that
the perturbed transfer function is positive real. For the special case $m = 1$, such an approach
is discussed in [?]. In this case, there is a simple eigenvalue-based characterization [?] of
positive realness. However, this characterization cannot be extended to the general case $m > 1$.
Another special case, which leads to linear SDPs, is described in [?].

In the general case $m > 1$, we employ perturbations $X_G$ and $X_C$ of the matrices $G$ and $C$
in (7). The resulting perturbed transfer function is then of the form

$$\tilde{Z}_n(s) = B_2^T \bigl( G + X_G + s(C + X_C) \bigr)^{-1} B_1, \tag{9}$$

and the problem is to construct the perturbations $X_G$ and $X_C$ such that $\tilde{Z}_n$ is positive real.
Applying the characterization (8) of positive realness to (9), we obtain the following nonconvex
NLSDP:

$$\begin{aligned} P^T (G + X_G) + (G + X_G)^T P &\succeq 0, \\ P^T (C + X_C) = (C + X_C)^T P &\succeq 0, \\ P^T B_1 &= B_2. \end{aligned} \tag{10}$$

Here, the unknowns are the matrices $P, X_G, X_C \in \mathbb{R}^{n \times n}$. If (10) has a solution $P$, $X_G$, $X_C$,
then choosing the matrices $X_G$ and $X_C$ as the perturbations in (9) guarantees passivity of the
reduced-order model given by the transfer function $\tilde{Z}_n$.

4 Linear semidefinite programs

In this section, we briefly review the case of linear semidefinite programs. 

Given a linear function $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$, a vector $b \in \mathbb{R}^n$, and a matrix $C \in \mathcal{S}^m$, a pair of
primal and dual linear semidefinite programs is as follows:

$$\text{maximize } C \bullet Y \quad \text{subject to } Y \in \mathcal{S}^m,\ Y \succeq 0,\ \mathcal{A}^*(Y) + b = 0, \tag{11}$$

and

$$\text{minimize } b^T x \quad \text{subject to } x \in \mathbb{R}^n,\ \mathcal{A}(x) + C \preceq 0. \tag{12}$$

We remark that this formulation is a slight variation of the standard pair of primal-dual
programs. We chose the above version in order to facilitate the generalization of problems of
the form (12) to nonlinear semidefinite programs in standard form.

If there exists a matrix $Y \succ 0$ that is feasible for (11), then we call $Y$ strictly feasible for
(11) and say that (11) satisfies Slater’s condition. Likewise, if there exists a vector $x$ such that
$\mathcal{A}(x) + C \prec 0$, then we call $x$ strictly feasible for (12) and say that (12) satisfies Slater’s condition.

The following optimality conditions for linear semidefinite programs are well known; see,
e.g., [?]. If problem (11) or (12) satisfies Slater’s condition, then the optimal values of (11) and
(12) coincide. Furthermore, if in addition both problems are feasible, then optimal solutions
$Y^{\mathrm{opt}}$ and $x^{\mathrm{opt}}$ of both problems exist and $Y := Y^{\mathrm{opt}}$ and $x := x^{\mathrm{opt}}$ satisfy the complementarity
condition

$$YS = 0, \quad \text{where } S := -C - \mathcal{A}(x). \tag{13}$$

Conversely, if $Y$ and $x$ are feasible points for (11) and (12), respectively, and satisfy the
complementarity condition (13), then $Y^{\mathrm{opt}} := Y$ is an optimal solution of (11) and $x^{\mathrm{opt}} := x$
is an optimal solution of (12).
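The converse statement can be illustrated on a tiny instance. In the sketch below, with hypothetical data for $n = 1$ and $m = 2$, the constraint $\mathcal{A}(x) + C \preceq 0$ reads $x \le 1$, and the candidate pair is checked for feasibility, for complementarity in the sense of (13), and for equal objective values.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M is positive semidefinite up to tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

# Example data: A(x) = x * A1, so A(x) + C = diag(x - 1, -1)
A1 = np.diag([1.0, 0.0])
C = np.diag([-1.0, -1.0])
b = np.array([-1.0])

x = np.array([1.0])        # candidate solution of the minimization problem (12)
Y = np.diag([1.0, 0.0])    # candidate solution of the maximization problem (11)
S = -C - x[0] * A1         # slack matrix from (13)

x_feasible = is_psd(S)                                   # A(x) + C <= 0
Y_feasible = is_psd(Y) and bool(np.isclose(np.sum(A1 * Y) + b[0], 0.0))
complementary = bool(np.allclose(Y @ S, 0.0))            # Y S = 0
value_11 = float(np.sum(C * Y))                          # C . Y
value_12 = float(b @ x)                                  # b^T x
print(x_feasible, Y_feasible, complementary, value_11, value_12)
```

In this instance $Y + S$ is the identity matrix, so the pair is also strictly complementary.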

These optimality conditions can be summarized as follows. If problem (12) satisfies Slater’s
condition, then for a point $x \in \mathbb{R}^n$ to be an optimal solution of (12) it is necessary and sufficient
that there exist matrices $Y \succeq 0$ and $S \succeq 0$ such that

$$\begin{aligned} \mathcal{A}(x) + C + S &= 0, \\ \mathcal{A}^*(Y) + b &= 0, \\ YS &= 0. \end{aligned} \tag{14}$$

Note that, in view of (6), the second equation in (14) can also be written in the form

$$\nabla_x \bigl( (\mathcal{A}(x) + C) \bullet Y \bigr) + b = \mathcal{A}^*(Y) + b = 0. \tag{15}$$

Furthermore, the last equation in (14) is equivalent to its symmetric form, $YS + SY = 0$; see,
e.g., [2]. In the case of strict complementarity, the derivatives of $YS = 0$ and $YS + SY = 0$
are also equivalent. For later use, we state these facts in the following lemma.

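Lemma 1 Let $Y, S \in \mathcal{S}^m$.

a) If $Y \succeq 0$ or $S \succeq 0$, then

$$YS = 0 \iff YS + SY = 0. \tag{16}$$

b) If $Y$ and $S$ are strictly complementary, i.e., $Y, S \succeq 0$, $YS = 0$, and $Y + S \succ 0$, then for
any $\dot{Y}, \dot{S} \in \mathcal{S}^m$,

$$Y\dot{S} + \dot{Y}S = 0 \iff Y\dot{S} + \dot{Y}S + \dot{S}Y + S\dot{Y} = 0. \tag{17}$$

Moreover, $Y$, $S$ have representations of the form

$$Y = U \begin{bmatrix} Y_1 & 0 \\ 0 & 0 \end{bmatrix} U^T, \qquad S = U \begin{bmatrix} 0 & 0 \\ 0 & S_2 \end{bmatrix} U^T, \tag{18}$$

where $U$ is an $m \times m$ orthogonal matrix, $Y_1 \succ 0$ is a $k \times k$ diagonal matrix, and $S_2 \succ 0$
is an $(m - k) \times (m - k)$ diagonal matrix, and any matrices $\dot{Y}, \dot{S} \in \mathcal{S}^m$ satisfying (17)
are of the form

$$\dot{Y} = U \begin{bmatrix} \dot{Y}_1 & \dot{Y}_3 \\ \dot{Y}_3^T & 0 \end{bmatrix} U^T, \qquad \dot{S} = U \begin{bmatrix} 0 & \dot{S}_3 \\ \dot{S}_3^T & \dot{S}_2 \end{bmatrix} U^T, \qquad \text{where } Y_1 \dot{S}_3 + \dot{Y}_3 S_2 = 0. \tag{19}$$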


Proof. The equivalence (16) is well known; see, e.g., [2, Page 749].

We now turn to the proof of part b). The strict complementarity of $Y$ and $S$ readily
implies that $Y$ and $S$ have representations of the form (18); see, e.g., [18, Page 62]. Any
matrices $\dot{Y}, \dot{S} \in \mathcal{S}^m$ can be written in the form

$$\dot{Y} = U \begin{bmatrix} \dot{Y}_1 & \dot{Y}_3 \\ \dot{Y}_3^T & \dot{Y}_2 \end{bmatrix} U^T, \qquad \dot{S} = U \begin{bmatrix} \dot{S}_1 & \dot{S}_3 \\ \dot{S}_3^T & \dot{S}_2 \end{bmatrix} U^T, \tag{20}$$

where $U$ is the matrix from (18) and the block sizes in (20) are the same as in (18). Using (18)
and (20), it follows that the equation on the left-hand side of (17) is satisfied if, and only if,

$$Y_1 \dot{S}_1 = 0, \qquad \dot{Y}_2 S_2 = 0, \qquad Y_1 \dot{S}_3 + \dot{Y}_3 S_2 = 0.$$

Since $Y_1$ and $S_2$ are in particular nonsingular, the first two relations imply $\dot{S}_1 = 0$ and $\dot{Y}_2 = 0$.
Thus, any matrices $\dot{Y}, \dot{S} \in \mathcal{S}^m$ satisfying the equation on the left-hand side of (17) are of the
form (19). Similarly, using (18) and (20), it follows that the equation on the right-hand side
of (17) is satisfied if, and only if,

$$Y_1 \dot{S}_1 + \dot{S}_1 Y_1 = 0, \qquad \dot{Y}_2 S_2 + S_2 \dot{Y}_2 = 0, \qquad Y_1 \dot{S}_3 + \dot{Y}_3 S_2 = 0.$$

Since $Y_1 \succ 0$ and $S_2 \succ 0$, the first two relations imply $\dot{S}_1 = 0$ and $\dot{Y}_2 = 0$, and so $\dot{Y}$, $\dot{S}$ are
again of the form (19). $\Box$
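A short numerical sketch may help to see why the semidefiniteness assumption in part a) matters and how the block form in part b) works. The first pair below is indefinite and satisfies $YS + SY = 0$ although $YS \ne 0$; the second pair is strictly complementary and is combined with directions built as in (19). All matrices, including the rotation `U`, are hypothetical example data.

```python
import numpy as np

# Indefinite symmetric pair: YS + SY = 0 holds although YS != 0.
Y_bad = np.array([[1.0, 0.0], [0.0, -1.0]])
S_bad = np.array([[0.0, 1.0], [1.0, 0.0]])
anti = Y_bad @ S_bad + S_bad @ Y_bad
prod_bad = Y_bad @ S_bad

# Strictly complementary pair of the form (18) with k = 1.
t = 0.3
U = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
Y = U @ np.diag([2.0, 0.0]) @ U.T
S = U @ np.diag([0.0, 5.0]) @ U.T

# Directions of the form (19): Y1 * Sdot3 + Ydot3 * S2 = 2*(-5) + 2*5 = 0.
Y_dot = U @ np.array([[1.0, 2.0], [2.0, 0.0]]) @ U.T
S_dot = U @ np.array([[0.0, -5.0], [-5.0, 7.0]]) @ U.T

left_17 = Y @ S_dot + Y_dot @ S            # left-hand side of (17)
right_17 = left_17 + left_17.T             # right-hand side of (17)
print(np.allclose(anti, 0.0), np.allclose(prod_bad, 0.0),
      np.allclose(left_17, 0.0), np.allclose(right_17, 0.0))
```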


5 Nonlinear semidefinite programs 


In this section, we consider nonlinear semidefinite programs, which are extensions of the dual
linear semidefinite programs (12).

Given a vector $b \in \mathbb{R}^n$ and a matrix-valued function $\mathcal{B} : \mathbb{R}^n \to \mathcal{S}^m$, we consider problems
of the following form:

$$\text{minimize } b^T x \quad \text{subject to } x \in \mathbb{R}^n,\ \mathcal{B}(x) \preceq 0. \tag{21}$$

Here, the function $\mathcal{B}$ is nonlinear in general, and thus (21) represents a class of nonlinear
semidefinite programs. We assume that the function $\mathcal{B}$ is at least $C^2$-differentiable.

For simplicity of presentation, we have chosen a simple form of problem (21). We stress
that problem (21) may also include additional nonlinear equality and inequality constraints.
The corresponding modifications are detailed at the end of this paper. Furthermore, the choice
of the linear objective function $b^T x$ in (21) was made only to simplify notation. A nonlinear
objective function can always be transformed into a linear one by adding one artificial variable
and one more constraint. In particular, all statements about (21) in this paper can be modified
so that they apply to additional nonlinear equality and inequality constraints and to nonlinear
objective functions.

Note that the class (21) reduces to linear semidefinite programs of the form (12) if $\mathcal{B}$ is an
affine function.

The Lagrangian $\mathcal{L} : \mathbb{R}^n \times \mathcal{S}^m \to \mathbb{R}$ of (21) is defined as follows:

$$\mathcal{L}(x, Y) := b^T x + \mathcal{B}(x) \bullet Y. \tag{22}$$

Its gradient with respect to $x$ is given by

$$g(x, Y) := \nabla_x \mathcal{L}(x, Y) = b + \nabla_x \bigl( \mathcal{B}(x) \bullet Y \bigr) \tag{23}$$

and its Hessian by

$$H(x, Y) := \nabla_x^2 \mathcal{L}(x, Y) = \nabla_x^2 \bigl( \mathcal{B}(x) \bullet Y \bigr). \tag{24}$$

If the problem (21) is convex and satisfies Slater’s condition, then for each optimal
solution $x$ of (21) there exists an $m \times m$ matrix $Y \succeq 0$ such that the pair $(x, Y)$ is a saddle
point of the Lagrangian (22), $\mathcal{L}$.

More generally, for nonconvex problems (21), let $x \in \mathbb{R}^n$ be a feasible point of (21), and
assume that the Robinson or Mangasarian-Fromovitz constraint qualification [?] is
satisfied at $x$, i.e., there exists a vector $\Delta x \ne 0$ such that $\mathcal{B}(x) + D_x \mathcal{B}(x)[\Delta x] \prec 0$. Then, if
$x$ is a local minimizer of (21), the first-order optimality condition is satisfied, i.e., there exist
matrices $Y, S \in \mathcal{S}^m$ such that

$$\begin{aligned} \mathcal{B}(x) + S &= 0, \\ g(x, Y) &= 0, \\ YS &= 0, \\ Y, S &\succeq 0. \end{aligned} \tag{25}$$

The system (25) is a straightforward generalization of the optimality conditions (14) and (15),
with the affine function $\mathcal{A}(x) + C$ in (15) replaced by the nonlinear function $\mathcal{B}(x)$.

Primal-dual interior-point methods for solving (21) roughly proceed as follows. For some
sequence of duality parameters $\mu_k > 0$, $\mu_k \to 0$, the solutions of the perturbed primal-dual
system,

$$\begin{aligned} \mathcal{B}(x) + S &= 0, \\ g(x, Y) &= 0, \\ YS &= \mu_k I_m, \\ Y, S &\succ 0, \end{aligned} \tag{26}$$

are approximated by some variant of Newton’s method. Since Newton’s method does not
preserve any inequalities, the parameters $\mu_k > 0$ are used to maintain strict feasibility, i.e.,
$Y, S \succ 0$ for all iterates.


The solutions of (26) coincide with the solutions of the standard logarithmic-barrier problems
for (21). Moreover, the logarithmic-barrier approach for solving (21) can be interpreted
as a certain choice of the ‘symmetrization operator’ for the equation $YS = \mu_k I_m$ in the third
row of (26); see Section 6 below. With this choice, the barrier function yields a very natural
criterion for the step-size control in trust-region algorithms. The authors have implemented
various versions of predictor-corrector trust-region barrier methods for solving (21). For a
number of examples, the running times of the resulting algorithms were comparable to the
behavior of interior-point methods for convex programs. However, the authors also encountered
several instances in which the number of iterations for these methods was very high compared
to the typical number of iterations needed for solving linear SDPs. For such negative examples
it may be more efficient to solve a sequence of linear SDPs in order to obtain an approximate
solution of (21). This observation motivated the SSP method described in the next section.

6 An SSP method for nonlinear semidefinite programs 

In this section, we introduce the sequential semidefinite programming (SSP) method, which is
a generalization of the SQP method for standard nonlinear programs to nonlinear semidefinite
programs of the form (21). For an overview of SQP methods for standard nonlinear programs,
we refer the reader to [?] and the references given there.

In analogy to the SQP method, at each iteration of the SSP method one solves a subproblem
that is slightly more difficult than the linearization of (25) at the current iterate. More precisely,
let $(x^k, Y^k)$ denote the current point at the beginning of the $k$-th iteration. One then determines
corrections $(\Delta x, \Delta Y)$ and a matrix $S$ such that

$$\begin{aligned} \mathcal{B}(x^k) + D_x \mathcal{B}(x^k)[\Delta x] + S &= 0, \\ b + H^k \Delta x + \nabla_x \bigl( \mathcal{B}(x^k) \bullet (Y^k + \Delta Y) \bigr) &= 0, \\ (Y^k + \Delta Y) S &= 0, \\ Y^k + \Delta Y,\ S &\succeq 0. \end{aligned} \tag{27}$$

Here and in the sequel, we use the notation

$$H^k := H(x^k, Y^k). \tag{28}$$

Recall from (23) and (24) that $g(x, Y)$ and $H(x, Y)$ denote the gradient and Hessian with
respect to $x$, respectively, of the Lagrangian $\mathcal{L}(x, Y)$, (22), of the nonlinear semidefinite
program (21). Moreover, from (23) it follows that the linearization of $g(x, Y)$ at the point $(x^k, Y^k)$
is given by

$$g(x^k + \Delta x, Y^k + \Delta Y) \approx b + H^k \Delta x + \nabla_x \bigl( \mathcal{B}(x^k) \bullet (Y^k + \Delta Y) \bigr).$$

Thus the second equation in (27) is just the linearization of the second equation in (25).
Furthermore, the first equation of (27) is a straightforward linearization of the first equation
in (25). This linearization is used in the same way in primal-dual interior methods.

The last two rows in (25) and (27) are identical when $Y$ in (25) is rewritten as $Y = Y^k + \Delta Y$.
In analogy to SQP methods for standard nonlinear programs, the problem of how to guarantee
the nonnegativity constraints, namely $\mathcal{B}(x) \preceq 0$, is thus shifted to the subproblem (27). If the
iterates $x^k$ generated from (27) converge, then their limit $\bar{x}$ automatically satisfies $\mathcal{B}(\bar{x}) \preceq 0$.
In contrast, interior methods use perturbations, symmetrizations, and linearizations for the
last two rows in (25), resulting in cheaper linear subproblems that are typically less ‘powerful’
than the subproblems (27). An important aspect for interior methods for linear SDPs is the
choice of the symmetrization procedure for the bilinear equation $YS = \mu_k I_m$; see, e.g., [?].
For convex SDPs, theoretical convergence analyses are well developed and also supported by
numerical evidence of rapid convergence. However, the generalization of these convergence
results to nonlinear nonconvex subproblems is far from obvious. The proposed SSP method
allows one to apply the symmetrization to linear SDPs, thus reducing this aspect to a well-studied
topic.

In summary, both the problem of choosing a suitable symmetrization scheme and the
problem of how to guarantee the nonnegativity constraints are shifted to the subproblem (27).

Note that the conditions (27) are the optimality conditions for the problem

$$\text{minimize } b^T \Delta x + \tfrac{1}{2} (\Delta x)^T H^k \Delta x \quad \text{subject to } \Delta x \in \mathbb{R}^n,\ \mathcal{B}(x^k) + D_x \mathcal{B}(x^k)[\Delta x] \preceq 0. \tag{29}$$

The conditions (27) and (29) have been considered in [?, Equations (2.1) and (2.2)], with the
remark that they have “been found to be an appropriate approximation of” (21) “for numerical
purposes”.
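To see the iteration at work without an SDP solver, one can look at the degenerate case $m = 1$, where $\mathcal{S}^1 = \mathbb{R}$, the constraint $\mathcal{B}(x) \preceq 0$ is a single scalar inequality, and the subproblem reduces to a quadratic program with one linear inequality that has a closed-form solution when the Hessian is positive definite. The sketch below is only an illustration under these assumptions and not the authors' implementation: the test problem (minimize $-x_1 - x_2$ subject to $x_1^2 + x_2^2 - 2 \le 0$), the starting point, and all function names are hypothetical. For $m > 1$, the subproblem would instead be passed to a solver for linear SDPs.

```python
import numpy as np

b = np.array([-1.0, -1.0])           # linear objective b^T x (example data)

def B(x):
    """Scalar constraint function (m = 1); feasibility means B(x) <= 0."""
    return x @ x - 2.0

def grad_B(x):
    """Derivative of B, stored as a vector a with D_x B(x)[d] = a^T d."""
    return 2.0 * x

def hess_lagrangian(x, y):
    """H(x, y) = y * (Hessian of B) = 2 y I for this example."""
    return 2.0 * y * np.eye(x.size)

def solve_subproblem(x, y):
    """Minimize b^T d + 0.5 d^T H d subject to B(x) + a^T d <= 0.

    Assumes H is positive definite (true here whenever y > 0).
    Returns the step d and the multiplier of the linearized constraint.
    """
    H = hess_lagrangian(x, y)
    a, c = grad_B(x), B(x)
    d = -np.linalg.solve(H, b)              # unconstrained minimizer
    if c + a @ d <= 0.0:
        return d, 0.0                       # linearized constraint inactive
    Hinv_a = np.linalg.solve(H, a)
    y_new = (c + a @ d) / (a @ Hinv_a)      # multiplier of the active constraint
    return d - y_new * Hinv_a, y_new

def ssp(x, y, iters=10):
    """Repeat: solve the subproblem, update the point and the multiplier."""
    for _ in range(iters):
        d, y = solve_subproblem(x, y)
        x = x + d
    return x, y

x_opt, y_opt = ssp(np.array([1.5, 0.5]), 1.0)
print(x_opt, y_opt)
```

For this example the iterates approach the solution $x = (1, 1)$ with multiplier $1/2$. The closed-form step breaks down if the multiplier becomes zero, since the Hessian of the Lagrangian then vanishes; a practical code would need a safeguard for that case.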

In order to be able to solve the subproblem (29) efficiently, in practice, one replaces the
matrix $H^k$ in (27), respectively (29), by a positive semidefinite approximation $\hat{H}^k$ of $H^k$. As in
the case of standard SQP methods, a BFGS update for the Hessian of the Lagrangian, (22),
can be used to approximate $H^k$ by some positive semidefinite matrix $\hat{H}^k$. Given an estimate
$\hat{H}^k$ of $H^k$ for the current, $k$-th, SSP iteration, the quasi-Newton condition to generate a BFGS
update $\hat{H}^{k+1}$ approximating the matrix $H^{k+1}$ for the next, $(k+1)$-st, SSP iteration can be
derived as follows:

$$\begin{aligned} \hat{H}^{k+1} \Delta x &= \nabla_x \bigl( \mathcal{B}(x^{k+1}) \bullet Y^{k+1} \bigr) - \nabla_x \bigl( \mathcal{B}(x^k) \bullet Y^{k+1} \bigr) \\ &= \nabla_x \mathcal{L}(x^{k+1}, Y^{k+1}) - \nabla_x \mathcal{L}(x^k, Y^{k+1}) \\ &\approx \nabla_x^2 \mathcal{L}(x^{k+1}, Y^{k+1}) \, (x^{k+1} - x^k). \end{aligned} \tag{30}$$

If $\hat{H}^k$ is positive semidefinite, the BFGS update with the above condition can be suitably
damped such that $\hat{H}^{k+1}$ is also positive semidefinite. At each iteration of the SSP method,
problem (29) is solved with $H^k$ replaced by the matrix $\hat{H}^k$ that is obtained by the BFGS
update of $\hat{H}^{k-1}$ from the previous SSP iteration. If $\hat{H}^k$ is positive semidefinite, problem
(29) essentially reduces to a linear SDP, since the convex quadratic term in the objective
function can be written as a semidefiniteness constraint or a second-order-cone constraint.
While the formulation as a second-order-cone constraint is more efficient, and for example,
can be specified as input for the program package SeDuMi [?] in order to solve (29), it was
pointed out by [?] that it may be most efficient to use a program that is designed for SDPs
with linear constraints and a convex quadratic objective function.
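The text does not say which damping rule is meant. One common choice in the SQP literature is Powell's damping, sketched below under the assumption that the current approximation is positive definite and the step is nonzero. The function name `damped_bfgs_update` and the constants 0.2 and 0.8 are the customary ones for that rule and are not taken from the paper.

```python
import numpy as np

def damped_bfgs_update(H, s, y):
    """One damped BFGS update of a positive definite matrix H.

    s : step x^{k+1} - x^k
    y : difference of Lagrangian gradients as in the quasi-Newton condition
    The vector y is replaced by a convex combination r of y and H s such
    that s^T r >= 0.2 * s^T H s, which keeps the update positive definite.
    """
    Hs = H @ s
    sHs = s @ Hs
    sy = s @ y
    theta = 1.0 if sy >= 0.2 * sHs else 0.8 * sHs / (sHs - sy)
    r = theta * y + (1.0 - theta) * Hs
    return H - np.outer(Hs, Hs) / sHs + np.outer(r, r) / (s @ r)

# Example with negative curvature along s (s^T y < 0), where the
# undamped BFGS formula would lose positive definiteness.
H0 = np.eye(2)
s = np.array([1.0, 0.0])
y = np.array([-1.0, 0.0])
H1 = damped_bfgs_update(H0, s, y)
print(H1, np.linalg.eigvalsh(H1))
```

When $s^T y$ is already sufficiently positive, `theta` equals one and the routine coincides with the standard BFGS formula.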

It seems that many results for standard SQP methods carry over in a straightforward
fashion to the SSP method. For example, the SSP method can be augmented by a penalty
term in case that the subproblems (29) become infeasible. In this case, the right-hand sides,
“0”, of the first three rows in (27) are replaced by weaker, penalized right-hand sides. Moreover,
the convergence analysis of the method proposed in [?] yields results that are comparable
to the ones for standard SQP methods.

The standard analysis of quadratic convergence of SQP methods for nonlinear programs
that satisfy strict complementarity conditions proceeds by first showing that the active
constraints will be identified correctly in the final stages of the algorithm and then using the
equivalence of the SQP iteration and the Newton iteration for the simplified KKT-system in
which only the active constraints are used.

For nonlinear semidefinite programs the situation is slightly more complicated since it is
difficult to identify active constraints. The paper [?] presents a proof that is based on a new
approach by Bonnans et al. [7] and uses some general results due to Robinson [?]. It does
not require strict complementarity and allows for approximate Hessian matrices in (27).

In the next two sections, we present a more elementary and self-contained approach to
analyze convergence of the SSP method under a strict complementarity condition.

7 Sensitivity results 

In this section, we establish sensitivity results, first for the special case of quadratic semidefinite
programs and then for general nonlinear semidefinite programs of the form (21). More precisely,
we show that strictly complementary solutions of such problems depend smoothly on the
problem data.

We start with quadratic semidefinite programs of the form

$$\text{minimize } f(x) \quad \text{subject to } x \in \mathbb{R}^n,\ \mathcal{A}(x) + C \preceq 0. \tag{31}$$

Here, $\mathcal{A} : \mathbb{R}^n \to \mathcal{S}^m$ is a linear function, $C \in \mathcal{S}^m$, and $f : \mathbb{R}^n \to \mathbb{R}$ is a quadratic function
defined by $f(x) = b^T x + \tfrac{1}{2} x^T H x$, where $b \in \mathbb{R}^n$ and $H \in \mathcal{S}^n$. Note that we make no further
assumptions on the matrix $H$. Thus, problem (31) is a general, possibly nonconvex, quadratic
semidefinite program.

The problem (31) is described by the data

$$\mathcal{D} := [\mathcal{A}, b, C, H]. \tag{32}$$

In Theorem 1 below, we present a sensitivity result for the solutions of (31) when the data $\mathcal{D}$ is
changed to $\mathcal{D} + \Delta\mathcal{D}$ where

$$\Delta\mathcal{D} := [\Delta\mathcal{A}, \Delta b, \Delta C, \Delta H] \tag{33}$$

is a sufficiently small perturbation. We use the norm

$$\|\mathcal{D}\| := \bigl( \|\mathcal{A}\|^2 + \|b\|^2 + \|C\|^2 + \|H\|^2 \bigr)^{1/2}$$

for data (32) and perturbations (33). Recall that $\|\mathcal{A}\|$ is defined in (2).

We denote by

$$\mathcal{L}^{(q)}(x, Y) := f(x) + (\mathcal{A}(x) + C) \bullet Y$$

the Lagrangian of problem (31). Note that $\nabla_x f(x) = b + Hx$. Together with (6), it follows
that

$$\nabla_x \mathcal{L}^{(q)}(x, Y) = b + Hx + \mathcal{A}^*(Y) \quad \text{and} \quad \nabla_x^2 \mathcal{L}^{(q)}(x, Y) = H.$$
Recall that problem (31) is said to satisfy Slater’s condition if there exists a vector $x \in \mathbb{R}^n$
with $\mathcal{A}(x) + C \prec 0$. Moreover, the triple $(x, Y, S)$, where $x \in \mathbb{R}^n$ and $Y, S \in \mathcal{S}^m$, is called a
stationary point of (31) if

$$\begin{aligned} \mathcal{A}(x) + C + S &= 0, \\ b + Hx + \mathcal{A}^*(Y) &= 0, \\ YS + SY &= 0, \\ Y, S &\succeq 0. \end{aligned} \tag{34}$$

Here, we have used equivalence (16) of Lemma 1 and replaced $YS = 0$ by its symmetric version,
which is stated as the third equation of (34). If in addition to (34), one has

$$Y + S \succ 0, \tag{35}$$

then $(x, Y, S)$ is said to be a strictly complementary stationary point of (31).

Let $x \in \mathbb{R}^n$ be a feasible point of (31). We say that $h \in \mathbb{R}^n$, $h \ne 0$, is a feasible direction at
$x$ if $\tilde{x} = x + \epsilon h$ is a feasible point of (31) for all sufficiently small $\epsilon > 0$. Following [?, Definition
2.1], we say that the second-order sufficient condition holds at $x$ with modulus $\mu > 0$ if for all
feasible directions $h \in \mathbb{R}^n$ at $x$ with $h^T (b + Hx) = h^T \nabla_x f(x) = 0$ one has

$$h^T H h = h^T \bigl( \nabla_x^2 \mathcal{L}^{(q)}(x, Y) \bigr) h \ge \mu \|h\|^2. \tag{36}$$

After these preliminaries, our main result of this section can be stated as follows. 

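Theorem 1 Assume that problem (31) satisfies Slater’s condition. Let $(x, Y, S)$ be a locally
unique and strictly complementary stationary point of (31) with data (32), $\mathcal{D}$, and assume
that the second-order sufficient condition holds at $x$ with modulus $\mu > 0$. Then, for
all sufficiently small perturbations (33), $\Delta\mathcal{D}$, there exists a locally unique stationary point
$(x(\Delta\mathcal{D}), Y(\Delta\mathcal{D}), S(\Delta\mathcal{D}))$ of the perturbed program (31) with data $\mathcal{D} + \Delta\mathcal{D}$. Moreover, the
point $(x(\Delta\mathcal{D}), Y(\Delta\mathcal{D}), S(\Delta\mathcal{D}))$ is a differentiable function of the perturbation (33), and for
$\Delta\mathcal{D} = 0$, $(x(0), Y(0), S(0)) = (x, Y, S)$. The derivative $D_{\mathcal{D}}(x(0), Y(0), S(0))$ at $(x, Y, S)$ is
characterized by the directional derivatives

$$(\dot{x}, \dot{Y}, \dot{S}) := D_{\mathcal{D}}(x(0), Y(0), S(0))[\Delta\mathcal{D}]$$

for any $\Delta\mathcal{D}$. Here $(\dot{x}, \dot{Y}, \dot{S})$ is the unique solution of the system of linear equations,

$$\begin{aligned} \mathcal{A}(\dot{x}) + \dot{S} &= -\Delta C - \Delta\mathcal{A}(x), \\ H\dot{x} + \mathcal{A}^*(\dot{Y}) &= -\Delta b - \Delta H x - \Delta\mathcal{A}^*(Y), \\ Y\dot{S} + \dot{Y}S + \dot{S}Y + S\dot{Y} &= 0, \end{aligned} \tag{37}$$

for the unknowns $\dot{x} \in \mathbb{R}^n$, $\dot{Y}, \dot{S} \in \mathcal{S}^m$. Finally, the second-order sufficient condition holds at
$x(\Delta\mathcal{D})$ whenever $\Delta\mathcal{D}$ is sufficiently small.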
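The linear system of Theorem 1 can be assembled and solved directly for small instances. The sketch below stores the symmetric unknowns by their upper-triangular entries, so that the system is square, builds the coefficient matrix column by column, and solves it with a dense solver. The helper names and the tiny instance (with $n = 1$, $m = 2$, and constraint $x \le 1$) are hypothetical; for that instance, perturbing $C$ by $\delta\,\mathrm{diag}(1, 0)$ moves the solution to $x = 1 - \delta$, so one expects $\dot{x} = -1$ for $\Delta C = \mathrm{diag}(1, 0)$.

```python
import numpy as np

def upper(M):
    """Upper-triangular entries (row-wise, including the diagonal) of M."""
    return M[np.triu_indices(M.shape[0])]

def sym_from_upper(v, m):
    """Symmetric m x m matrix whose upper triangle is filled from v."""
    M = np.zeros((m, m))
    M[np.triu_indices(m)] = v
    return M + np.triu(M, 1).T

def lin(mats, x):
    """Linear map: sum_i x_i * mats[i]."""
    return sum(xi * Mi for xi, Mi in zip(x, mats))

def adj(mats, Y):
    """Adjoint map: vector of trace inner products mats[i] . Y."""
    return np.array([np.sum(Mi * Y) for Mi in mats])

def directional_derivative(A, H, x, Y, S, dA, db, dC, dH):
    """Solve the linear system of Theorem 1 for (x_dot, Y_dot, S_dot)."""
    n, m = len(A), Y.shape[0]
    p = m * (m + 1) // 2

    def lhs(z):
        xd = z[:n]
        Yd = sym_from_upper(z[n:n + p], m)
        Sd = sym_from_upper(z[n + p:], m)
        r1 = lin(A, xd) + Sd
        r2 = H @ xd + adj(A, Yd)
        r3 = Y @ Sd + Yd @ S + Sd @ Y + S @ Yd
        return np.concatenate([upper(r1), r2, upper(r3)])

    N = n + 2 * p
    M = np.column_stack([lhs(e) for e in np.eye(N)])   # coefficient matrix
    rhs = np.concatenate([upper(-dC - lin(dA, x)),
                          -db - dH @ x - adj(dA, Y),
                          np.zeros(p)])
    z = np.linalg.solve(M, rhs)
    return z[:n], sym_from_upper(z[n:n + p], m), sym_from_upper(z[n + p:], m)

# Example instance: minimize -x subject to diag(x - 1, -1) <= 0.
A = [np.diag([1.0, 0.0])]
H = np.zeros((1, 1))
x, Y, S = np.array([1.0]), np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
no_dA, no_dH = [np.zeros((2, 2))], np.zeros((1, 1))

x_dot, Y_dot, S_dot = directional_derivative(
    A, H, x, Y, S, no_dA, np.array([0.0]), np.diag([1.0, 0.0]), no_dH)
print(x_dot, Y_dot, S_dot)
```

Assembling the matrix from unit vectors is convenient for a demonstration but scales poorly; it is used here only to keep the sketch short.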

Remark 1 Theorem 1 is an extension of the sensitivity result for linear semidefinite programs
presented in [?]. A related sensitivity result for linear semidefinite programs for a
more restricted class of perturbations, but also under weaker assumptions, is given in [?]. A
local Lipschitz continuity property of unique and strictly complementary solutions of linear
semidefinite programs is derived in [?].

Remark 2 While we did not explicitly state a linear independence constraint qualification,
commonly referred to as LICQ, it is implied by our condition of uniqueness of the stationary
point; see, e.g., [?]. Moreover, our assumptions on the stationary point $(x, Y, S)$ imply that
$x$ is a strict local minimizer of (31).

Remark 3 The first and third equations in (37) are symmetric $m \times m$ matrix equations, and
so only their upper triangular parts have to be considered. Thus the total number of scalar
equations in (37) is $m^2 + m + n$. On the other hand, there are $m^2 + m + n$ unknowns, namely
the entries of $\dot{x} \in \mathbb{R}^n$ and of the upper triangular parts of $\dot{Y}, \dot{S} \in \mathcal{S}^m$. Hence, (37) is a square
system.

Remark 4 In view of part b) of Lemma 1, the last equation of (37) is equivalent to

$$Y\dot{S} + \dot{Y}S = 0. \tag{38}$$

Thus, Theorem 1 can be stated equivalently with (38) in place of the last equation in (37). However,
the resulting system of equations would then be overdetermined.

Proof of Theorem m The proof is divided into four steps. 

Step 1. In this step, we establish the following result. If the perturbed program has a local solution that is a differentiable function of the perturbation, then the derivative is indeed a solution of (37).

Slater's condition is invariant under small perturbations of the problem data. Hence, if there exists a local solution $\bar x + \Delta x$, $\bar S + \Delta S$ of the perturbed problem near $\bar x$, $\bar S$, then the necessary first-order conditions of the perturbed problem apply at $\bar x + \Delta x$, $\bar S + \Delta S$, and state that there exists a matrix $\Delta Y$ such that $\bar Y + \Delta Y \succeq 0$, $\bar S + \Delta S \succeq 0$, and

$$
\begin{aligned}
(\mathcal A + \Delta\mathcal A)(\bar x + \Delta x) + C + \Delta C + \bar S + \Delta S &= 0,\\
b + \Delta b + (H + \Delta H)(\bar x + \Delta x) + (\mathcal A^* + \Delta\mathcal A^*)(\bar Y + \Delta Y) &= 0,\\
(\bar Y + \Delta Y)(\bar S + \Delta S) + (\bar S + \Delta S)(\bar Y + \Delta Y) &= 0.
\end{aligned}
\tag{39}
$$

Subtracting from these equations the first three equations of El yields 

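$$
\begin{aligned}
(\mathcal A + \Delta\mathcal A)(\Delta x) + \Delta S &= -\Delta C - \Delta\mathcal A(\bar x),\\
(H + \Delta H)\Delta x + (\mathcal A^* + \Delta\mathcal A^*)(\Delta Y) &= -\Delta b - \Delta H\,\bar x - \Delta\mathcal A^*(\bar Y),\\
\bar Y\Delta S + \Delta Y\bar S + \Delta S\,\bar Y + \bar S\Delta Y &= -\Delta Y\Delta S - \Delta S\Delta Y.
\end{aligned}
\tag{40}
$$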

Neglecting the second-order terms in El, and using El, we obtain the result claimed in EZI- 
It still remains to verify the existence and differentiability of $\Delta x$, $\Delta Y$, $\Delta S$.

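Step 2. In this step, we prove that the system of linear equations (37) has a unique solution. To this end, we show that the homogeneous version of (37), i.e., the system

$$
\begin{aligned}
\mathcal A(\dot x) + \dot S &= 0,\\
H\dot x + \mathcal A^*(\dot Y) &= 0,\\
\bar Y\dot S + \dot Y\bar S + \dot S\bar Y + \bar S\dot Y &= 0,
\end{aligned}
\tag{41}
$$

only has the trivial solution $\dot x = 0$, $\dot Y = \dot S = 0$.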
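
Let $\dot x \in \mathbb R^n$, $\dot Y, \dot S \in \mathcal S^m$ be any solution of (41). Recall that, in view of part b) of Lemma 1, we may assume that $\bar Y$ and $\bar S$ are given in diagonal form:

$$
\bar Y = \begin{pmatrix} \bar Y_1 & 0 \\ 0 & 0 \end{pmatrix},\qquad
\bar S = \begin{pmatrix} 0 & 0 \\ 0 & \bar S_2 \end{pmatrix},\qquad
\text{where } \bar Y_1, \bar S_2 \succ 0 \text{ and } \bar Y_1, \bar S_2 \text{ are diagonal.}
\tag{42}
$$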


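Indeed, this can be done by replacing, in (41), $\dot Y$, $\dot S$, $\bar Y$, $\bar S$, $\mathcal A(\dot x)$ by $U^T\dot YU$, $U^T\dot SU$, $U^T\bar YU$, $U^T\bar SU$, $U^T\mathcal A(\dot x)U$, respectively, where $U$ is the matrix in (18), and then multiplying the first and third rows from the left by $U$ and from the right by $U^T$. Furthermore, in view of part b) of Lemma 1, any matrices $\dot Y, \dot S \in \mathcal S^m$ satisfying the last equation of (41) are then of the form

$$
\dot Y = \begin{pmatrix} \dot Y_1 & \dot Y_3 \\ \dot Y_3^T & 0 \end{pmatrix},\qquad
\dot S = \begin{pmatrix} 0 & \dot S_3 \\ \dot S_3^T & \dot S_2 \end{pmatrix},\qquad
\text{where } \dot Y_3\bar S_2 + \bar Y_1\dot S_3 = 0.
\tag{43}
$$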


Next, we establish the inequality

$$
\dot x^T H \dot x \ge \mu\,\|\dot x\|^2, \tag{44}
$$

where $\mu > 0$ is the modulus of the second-order sufficient condition. Assume that $\dot x \ne 0$.
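Let $\hat x \in \mathbb R^n$ be a Slater point for the unperturbed problem. This guarantees that

$$
M = \begin{pmatrix} M_1 & M_3 \\ M_3^T & M_2 \end{pmatrix} := -(\mathcal A(\hat x) + C) \succ 0, \tag{45}
$$

where the block partitioning of $M$ is conforming with (42). For $\eta > 0$, set

$$
h^+ := \dot x + \eta(\hat x - \bar x) \quad\text{and}\quad h^- := -\dot x + \eta(\hat x - \bar x). \tag{46}
$$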

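Since $\dot x \ne 0$, there exists an $\eta_0 > 0$ such that $h^\pm \ne 0$ for all $0 < \eta \le \eta_0$. Next, we prove that for all such $\eta$, both vectors $h^+$ and $h^-$ are feasible directions for the unperturbed problem at $\bar x$. Let $0 < \eta \le \eta_0$ be arbitrary, but fixed. We then need to verify that $\mathcal A(\bar x + \epsilon h^\pm) + C \preceq 0$ for all sufficiently small $\epsilon > 0$. Recall that $\mathcal A$ is a linear function. Using (42), (43), (45), (46), the first equation of (41), and the first-order condition $\mathcal A(\bar x) + C + \bar S = 0$, one readily verifies that

$$
\mathcal A(\bar x + \epsilon h^\pm) + C
= -\begin{pmatrix} 0 & 0 \\ 0 & \bar S_2 \end{pmatrix}
- \epsilon \begin{pmatrix} \eta M_1 & \eta M_3 \pm \dot S_3 \\ (\eta M_3 \pm \dot S_3)^T & \eta(M_2 - \bar S_2) \pm \dot S_2 \end{pmatrix}.
\tag{47}
$$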


Recall that $\eta > 0$ is fixed. Since, by (42) and (45), $\bar S_2 \succ 0$ and $M_1 \succ 0$, a standard Schur-complement argument shows that the matrix on the right-hand side of (47) is negative definite for all sufficiently small $\epsilon > 0$. Thus the vectors (46) are feasible directions for the unperturbed problem at $\bar x$ for any $\eta > 0$. This in turn implies

$$
\dot x^T(b + H\bar x) = \dot x^T\nabla_x f(\bar x) = 0. \tag{48}
$$

Indeed, suppose that $\dot x^T\nabla_x f(\bar x) < 0$. Then, for sufficiently small $\eta > 0$, the feasible direction $h^+$ also satisfies $(h^+)^T\nabla_x f(\bar x) < 0$, and thus is a descent direction for the objective function $f$ at the point $\bar x$. This contradicts the local optimality of $\bar x$. Likewise, if $\dot x^T\nabla_x f(\bar x) > 0$, then, for sufficiently small $\eta > 0$, $h^-$ is a descent direction for the objective function $f$ at the point $\bar x$, leading to the same contradiction. The second-order sufficient condition also holds true on the closure of the feasible directions. Since $\dot x$ is the limit of the feasible directions (46) for $\eta \to 0$ and $\dot x$ satisfies (48), the inequality (44) follows from the second-order sufficient condition.

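Next recall from (42) that $\bar Y_1$ and $\bar S_2$ are positive definite diagonal matrices. The last relation in (43) thus implies that corresponding entries of the matrices $\dot Y_3$ and $\dot S_3$ are either zero or of opposite sign. It follows that $\langle \dot Y_3, \dot S_3\rangle \le 0$, and equality holds if, and only if, $\dot Y_3 = \dot S_3 = 0$. Using this inequality, together with the first two relations in (43), the first two equations of (41), and (44), one readily verifies that

$$
0 \ge 2\langle \dot Y_3, \dot S_3\rangle = \langle \dot Y, \dot S\rangle = -\langle \dot Y, \mathcal A(\dot x)\rangle = -\langle \mathcal A^*(\dot Y), \dot x\rangle = \langle H\dot x, \dot x\rangle = \dot x^T H \dot x \ge \mu\,\|\dot x\|^2.
$$

Since $\mu > 0$, this implies

$$
\dot x = 0 \quad\text{and}\quad \dot Y_3 = \dot S_3 = 0. \tag{49}
$$

By the first row of (41), it further follows that

$$
\dot S = -\mathcal A(\dot x) = -\mathcal A(0) = 0. \tag{50}
$$

Thus it only remains to show that $\dot Y = 0$. In view of (43) and (49), we have

$$
\dot Y = \begin{pmatrix} \dot Y_1 & 0 \\ 0 & 0 \end{pmatrix}. \tag{51}
$$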


Now suppose that Yi A 0- Then, by El and El, we have 

$$
Y_\epsilon := \bar Y + \epsilon\dot Y \succeq 0 \quad\text{and}\quad Y_\epsilon \ne \bar Y
$$

for all sufficiently small |e|. Moreover, using El, El, and EH, one readily verifies that the 
point {x,Yf:,S) also satisfies (|ni|) for all sufficiently small |e|. This contradicts the assumption 
that {x,Y,S) is a locally unique stationary point. Hence Fi = 0 and, by El, F = 0 . 

This concludes the proof that the square system (37) is nonsingular.
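
The sign argument used in Step 2 for the inner product of the off-diagonal blocks can be spelled out entrywise. In the sketch below, the diagonal entries $y_i > 0$ of $\bar Y_1$ and $s_j > 0$ of $\bar S_2$ are index names introduced here for illustration only.

```latex
(\dot Y_3)_{ij}\, s_j + y_i\, (\dot S_3)_{ij} = 0
\quad\Longrightarrow\quad
(\dot S_3)_{ij} = -\frac{s_j}{y_i}\,(\dot Y_3)_{ij},
\qquad
\langle \dot Y_3, \dot S_3 \rangle
  = \sum_{i,j} (\dot Y_3)_{ij}\,(\dot S_3)_{ij}
  = -\sum_{i,j} \frac{s_j}{y_i}\,(\dot Y_3)_{ij}^2 \;\le\; 0 .
```

Equality forces every entry of $\dot Y_3$, and hence of $\dot S_3$, to vanish. Since the diagonal blocks of $\dot Y$ and $\dot S$ are paired with zero blocks, $\langle \dot Y, \dot S\rangle = 2\langle \dot Y_3, \dot S_3\rangle$.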

Step 3. In this step, we show that the nonlinear system (1^ has a local solution that depends 
smoothly on the perturbation AD. To this end, we apply the implicit function theorem to the 
system 

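$$
\mathcal A(x) + C + S = 0,\qquad Hx + b + \mathcal A^*(Y) = 0,\qquad YS + SY = 0. \tag{52}
$$

As we have just seen, the linearization of (52) at the point $(\bar x, \bar Y, \bar S)$ is nonsingular, and hence the perturbed system has a differentiable and locally unique solution $(\bar x(\Delta\mathcal D), \bar Y(\Delta\mathcal D), \bar S(\Delta\mathcal D))$. Furthermore, we have $\bar Y(\Delta\mathcal D), \bar S(\Delta\mathcal D) \succeq 0$. This semidefiniteness follows with a standard continuity argument: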
The optimality conditions of the nonlinear SDP coincide with the optimality conditions of 
the linearized SDP. Under our assumptions, the latter one has a unique optimal solution that 
depends continuously on small perturbations of the data; see, e.g., m Hence the linearized 
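problem at the data point $\mathcal D + \Delta\mathcal D$ has an optimal solution $(\tilde x, \tilde Y, \tilde S)$ that satisfies the same optimality conditions as $(\bar x(\Delta\mathcal D), \bar Y(\Delta\mathcal D), \bar S(\Delta\mathcal D))$. The solution of the linearized problem also satisfies $\tilde Y \succeq 0$, $\tilde S \succeq 0$. Since $(\bar x(\Delta\mathcal D), \bar Y(\Delta\mathcal D), \bar S(\Delta\mathcal D))$ is locally unique, it must coincide with $(\tilde x, \tilde Y, \tilde S)$, i.e., $\bar Y(\Delta\mathcal D)$, $\bar S(\Delta\mathcal D)$ satisfy the semidefiniteness conditions.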

Step 4. In this step, we prove that the second-order sufficient condition is satisfied at the perturbed solution. Since feasible directions $h$ are defined only up to a positive scalar factor, without loss of generality, one may require that $\|h\| = 1$. For the unperturbed problem, the second-order sufficient condition at $\bar x$ then states that $h^T H h \ge \mu$ for all $h \in \mathbb R^n$ with $\mathcal A(\bar x + \epsilon h) + C \preceq 0$, $\epsilon = \epsilon(h) > 0$, $\|h\| = 1$, and $h^T(b + H\bar x) = 0$. To prove that the second-order sufficient condition is invariant under small perturbations $\Delta\mathcal D$ of the problem data $\mathcal D$, we thus need to show that for some fixed $\tilde\mu > 0$, we have

$$
h^T(H + \Delta H)h \ge \tilde\mu \tag{53}
$$

for all solutions $h \in \mathbb R^n$ of the inequalities

$$
(\mathcal A + \Delta\mathcal A)(\bar x(\Delta\mathcal D) + \epsilon h) + C + \Delta C \preceq 0,\quad \epsilon = \epsilon(h) > 0,\quad \|h\| = 1,\quad h^T\bigl(b + \Delta b + (H + \Delta H)\bar x(\Delta\mathcal D)\bigr) = 0.
$$

In view of the first two first-order conditions at the perturbed solution, the above is equivalent to

$$
\epsilon(\mathcal A + \Delta\mathcal A)(h) \preceq \bar S(\Delta\mathcal D),\quad \epsilon = \epsilon(h) > 0,\quad \|h\| = 1,\quad h^T(\mathcal A^* + \Delta\mathcal A^*)(\bar Y(\Delta\mathcal D)) = 0. \tag{54}
$$

It remains to show that the set of solutions $h$ of (54) varies continuously with $\Delta\mathcal D$. Indeed, for any fixed $\tilde\mu$ with $0 < \tilde\mu < \mu$, the second-order condition at $\bar x$ then readily implies that (53) is satisfied for all solutions $h$ of (54), provided $\|\Delta\mathcal D\|$ is sufficiently small.

In Step 2, we have shown that both $\bar S(\Delta\mathcal D)$ and $\bar Y(\Delta\mathcal D)$ are continuous functions of $\Delta\mathcal D$ and that the dimension of the null space of $\bar S(\Delta\mathcal D)$ is constant, namely equal to $k$, for all sufficiently small $\|\Delta\mathcal D\|$. Moreover, the null space of $\bar S(\Delta\mathcal D)$ varies continuously with $\Delta\mathcal D$.

Let $\Delta\mathcal D_k$ be a sequence of perturbations with $\Delta\mathcal D_k \to 0$. Let $h_k$ be a sequence of associated solutions of (54). It suffices to show that any accumulation point $\bar h$ of the sequence $h_k$ satisfies (54) for $\Delta\mathcal D = 0$ and the associated values $\Delta\mathcal A = 0$, $\bar Y(0) = \bar Y$, $\bar S(0) = \bar S$. Since $\bar Y(\Delta\mathcal D)$ and $\Delta\mathcal A^*$ vary continuously with $\Delta\mathcal D$, it follows that $\bar h$ satisfies the last two relations of (54) for $\Delta\mathcal D = 0$.

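We now assume by contradiction that $\epsilon\mathcal A(\bar h) \not\preceq \bar S$ for any $\epsilon > 0$. Since $\bar S \succeq 0$, this implies that there exists a vector $z \in \mathbb R^m$ with $\|z\| = 1$, $z^T\mathcal A(\bar h)z = \epsilon > 0$, and $z^T\bar Sz = 0$. It follows that

$$
z^T(\mathcal A + \Delta\mathcal A_k)(h_k)z > \tfrac{\epsilon}{2}
$$

if $k$ is sufficiently large. Since the null space of $\bar S(\Delta\mathcal D)$ varies continuously with $\Delta\mathcal D$, we have

$$
(z + \Delta z_k)^T\bar S(\Delta\mathcal D_k)(z + \Delta z_k) = 0
$$

for some small $\Delta z_k \in \mathbb R^m$ whenever $\|\Delta\mathcal D_k\|$ is sufficiently small. We now choose $\|\Delta\mathcal D_k\|$ so small, i.e., $k$ so large, that

$$
(z + \Delta z_k)^T(\mathcal A + \Delta\mathcal A_k)(h_k)(z + \Delta z_k) > 0.
$$

This implies that $h_k$ does not satisfy (54) and thus yields the desired contradiction. Hence $\bar h$ satisfies (54) for $\Delta\mathcal D = 0$. □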

Theorem ^ can be sharpened slightly. 

Corollary 1 Under the assumptions of Theorem 1 there exists a small neighborhood $\mathcal N$ of zero
in the data space of (123 such that for all perturbations AV £ M of the problem data ((23; of 
there exists a local solution S^) of HdfAi near {x,Y,S), at which the assumptions 

of Theorem\^are also satisfied. Moreover, the seeond derivatives 

YI{xa,Ya,Sa)[AV] 

of such local solutions {xa, Ya, S'a) ore uniformly bounded for all AV £ Af. 

Proof. The first part of the corollary is an immediate consequence of Theorem 1. For the second part observe that the second derivative is obtained by differentiating the system (37). For sufficiently small perturbations $\Delta\mathcal D$, the singular values of this system are uniformly bounded away from zero, and hence the second derivatives are uniformly bounded. □

Theorem 1 can be generalized to the class of NLSDPs of the form (21). Recall that, by (22)
and (El, the Lagrangian of El and its Hessian are given by 

$$
\mathcal L(x,Y) = b^Tx + \mathcal B(x)\bullet Y \quad\text{and}\quad H(x,Y) = \nabla_x^2\bigl(\mathcal B(x)\bullet Y\bigr), \tag{55}
$$

respectively. The generalization of Theorem 1 to problems (21) is then as follows.

Theorem 2 Let $x^*$ be a local solution of (21), and let $Y^*$ be an associated Lagrange multiplier. Assume that the Robinson constraint qualification is satisfied at $x^*$ and that the point $(x^*, Y^*)$ is strictly complementary and locally unique. Finally, assume that the second-order sufficient condition holds at $x^*$ with modulus $\mu > 0$. Then, (21) has a locally unique solution for small perturbations of the data $(\mathcal B, b)$, and the solution depends smoothly on the perturbations.

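Proof. First, we define the linear function $\mathcal A := D_x\mathcal B(x^*) : \mathbb R^n \to \mathcal S^m$, and the matrices $C := \mathcal B(x^*)$ and $H := H(x^*, Y^*)$. Then, the SSP approximation (with $p = q = 0$) of (21) at the point $(x^*, Y^*)$ is just the quadratic semidefinite problem

$$
\begin{array}{ll}
\text{minimize} & b^T\Delta x + \tfrac12(\Delta x)^TH\Delta x\\
\text{subject to} & \Delta x \in \mathbb R^n,\quad \mathcal A(\Delta x) + C \preceq 0.
\end{array}
\tag{56}
$$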

Note that (1561) is a problem of the form m with data (jSl, V. Moreover, Ax := 0, Y := Y*, 
and S := — M(0) — C satisfy the conditions (IHl) . These conditions coincide with the first- 
order conditions of El, and thus the point {Ax,Y,S) is also the unique solution of (IHl) . 
Furthermore, the second-order sufficient condition for (El and El coincide. This condition 
guarantees that Ax is a locally unique solution of (El- Finally, the Robinson constraint 
qualification for problem (21) at $x^*$ implies that problem (56) satisfies Slater's condition. In particular, all assumptions of Theorem 1 are satisfied. Small perturbations $\Delta\mathcal D$ of the data of (21) result in small changes of the corresponding SSP problem (56). Since Theorem 1 allows for arbitrary changes in all of the data of (56), the claims follow. □

8 Convergence of the SSP method 

In this section, we prove that the plain SSP method with step size 1 is locally quadratically 
convergent. 




For pairs $(x,Y)$, where $x \in \mathbb R^n$, $Y \in \mathcal S^m$, we use the norm

$$
\|(x,Y)\| := \bigl(\|x\|^2 + \|Y\|^2\bigr)^{1/2}.
$$

The main result of this section can then be stated as follows. 

Theorem 3 Assume that the function $\mathcal B$ in (21) is $C^3$-differentiable and that problem (21) has a locally unique and strictly complementary solution $(\bar x, \bar Y)$ that satisfies the Robinson constraint qualification and the second-order sufficient condition with modulus $\mu > 0$. Let some iterate $(x^k, Y^k)$ be given and let the next iterate

$$
(x^{k+1}, Y^{k+1}) := (x^k, Y^k) + (\Delta x, \Delta Y)
$$

be defined as the local solution of the SSP subproblem (29) that is closest to $(x^k, Y^k)$. Then there exist $\epsilon > 0$ and $\gamma < 1/\epsilon$ such that

$$
\|(x^{k+1}, Y^{k+1}) - (\bar x, \bar Y)\| \le \gamma\,\|(x^k, Y^k) - (\bar x, \bar Y)\|^2
$$

whenever $\|(x^k, Y^k) - (\bar x, \bar Y)\| < \epsilon$.

Proof. The proof is divided into three steps. In the first step, we establish the exact relation of problems (29) and (31). In a second step, we consider a point $x^k$ near $\bar x$. We show that $x^k$ is the optimal solution of an SSP subproblem the data of which is at most $O(\|x^k - \bar x\|)$ away from the data of the SSP subproblem at $(\bar x, \bar Y)$. We remark that $x^k$ is always the optimal solution of the $(k-1)$-st subproblem, but the data of this subproblem lies $O(\|x^{k-1} - \bar x\|)$ away from the SSP subproblem at $(\bar x, \bar Y)$. In a third step, we then show by a perturbation analysis that the correction $\Delta x = x^{k+1} - x^k$ is of size $O(\|x^k - \bar x\| + \|Y^k - \bar Y\|)$ and that the residual for the SSP subproblem in the $(k+1)$-st step is of size $O((\|x^k - \bar x\| + \|Y^k - \bar Y\|)^2)$.

Step 1. We first show how the SSP subproblem (29) can be written in the form (31). To this end, we define the linear function $\mathcal A := D_x\mathcal B(x^k) : \mathbb R^n \to \mathcal S^m$, and the matrices $C := \mathcal B(x^k)$ and $H := H(x^k, Y^k)$. Note that the linear constraint $\mathcal A(\Delta x) + C \preceq 0$ is just the linearization of the nonlinear constraint $\mathcal B(x^k + \Delta x) \preceq 0$ about the point $x^k$. Finally, let $b$ be as in (21). The SSP subproblem (29) then takes the simple form (56), and in particular, it conforms with the format (31) of Theorem 1.

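Step 2. Let any point $x^k$ close to $\bar x$ be given. We show that $\Delta x = 0$ is a local solution of a problem of the form (56), where the data is 'close' to the data of (56) at $(\bar x, \bar Y)$. Let $\Delta C := \mathcal B(\bar x) - \mathcal B(x^k)$. By continuity of $\mathcal B$, $\|\Delta C\|$ is of order $O(\|x^k - \bar x\|)$. Let

$$
\Delta b := -\nabla_x\bigl(\mathcal B(x^k)\bullet\bar Y\bigr) - b = -\mathcal A^*(\bar Y) - b.
$$

From the stationarity condition $b + \nabla_x(\mathcal B(\bar x)\bullet\bar Y) = 0$ at $(\bar x, \bar Y)$, we obtain the estimate $\|\Delta b\| = O(\|x^k - \bar x\|)$, and the point $(0, \bar Y, \bar S)$ satisfies the first-order conditions,

$$
\mathcal A(0) + C + \Delta C + \bar S = 0,\qquad b + \Delta b + H\cdot 0 + \mathcal A^*(\bar Y) = 0,\qquad \bar Y\bar S = 0,
$$

for the quadratic semidefinite program

$$
\begin{array}{ll}
\text{minimize} & (b + \Delta b)^T\Delta x + \tfrac12(\Delta x)^TH\Delta x\\
\text{subject to} & \Delta x \in \mathbb R^n,\quad \mathcal A(\Delta x) + C + \Delta C \preceq 0.
\end{array}
\tag{57}
$$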

Let

$$
\bar{\mathcal A} := D_x\mathcal B(\bar x),\quad \bar b := b,\quad \bar C := \mathcal B(\bar x),\quad \bar H := \nabla_x^2\mathcal L(\bar x, \bar Y), \tag{58}
$$

be the data of the SSP subproblem (56) at the point $(\bar x, \bar Y)$. Then, the data of (57) differs from the data (58) by a perturbation of norm $O(\|x^k - \bar x\| + \|Y^k - \bar Y\|)$. Here, the term $\|Y^k - \bar Y\|$ reflects the fact that $H$ and $\bar H$ also differ by the choice of $Y$. Note that the point $(\Delta x, Y, S) = (0, \bar Y, \bar S)$ satisfies the first-order optimality conditions,

$$
\bar{\mathcal A}(\Delta x) + \bar C + S = 0,\qquad \bar b + \bar H\Delta x + \bar{\mathcal A}^*(Y) = 0,\qquad YS = 0, \tag{59}
$$

of the quadratic problem (56) with data (58). As shown in Theorem 2, the assumptions for the nonlinear SDP (21) at $(\bar x, \bar Y)$ imply that problem (56) with data (58) satisfies all conditions of Theorem 1 at $(0, \bar Y, \bar S)$.

Since the second-order sufficient condition depends continuously on the data of (56), it follows that for (57), the second-order condition at $(0, \bar Y, \bar S)$ is satisfied, provided that $\|x^k - \bar x\|$ and $\|Y^k - \bar Y\|$ are sufficiently small. Thus problem (57) fulfills all assumptions of Theorem 1.

Step 3. By definition, $(\Delta x, Y) = (0, \bar Y)$ is the optimal solution (with associated multiplier) of (57). Let $(x^k, Y^k)$ be close to $(\bar x, \bar Y)$. The SSP subproblem replaces the data $\Delta b$ and $\Delta C$ of (57) by 0 (of the respective dimension). Thus, the data of (57) is changed by a perturbation of order $O(\|x^k - \bar x\| + \|Y^k - \bar Y\|)$. We assume that this perturbation lies in the neighborhood $\mathcal N$ about zero as guaranteed by Corollary 1. Denote the optimal solution of the SSP subproblem by $(\Delta x, \bar Y + \Delta\bar Y)$.

The SSP subproblem is then used to define $(x^{k+1}, Y^{k+1})$. Let

$$
\mathcal A^+ := D_x\mathcal B(x^{k+1}),\quad C^+ := \mathcal B(x^{k+1}),\quad H^+ := \nabla_x^2\mathcal L(x^{k+1}, Y^{k+1})
$$

be the data of the SSP subproblem at the next, $(k+1)$-st iteration.

Corollary 1 states that $(\Delta x, \Delta\bar Y)$ are given by the tangent equations (37) plus some uniformly bounded second-order terms. Thus, $(\Delta x, \Delta\bar Y)$ are of the order $O(\|x^k - \bar x\| + \|Y^k - \bar Y\|)$. Here, $\Delta\bar Y$ is a correction of the unknown point $\bar Y$, while the correction $\Delta Y = Y^{k+1} - Y^k$ produced by the SSP subproblem has the form $\Delta Y = \Delta\bar Y + \bar Y - Y^k$. Obviously, also the norm $\|\Delta Y\|$ of this correction is of the order $O(\|x^k - \bar x\| + \|Y^k - \bar Y\|)$.

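Next, we compute an upper bound on the size of the residuals of the first and second equations in the first-order conditions of the form (59) at $(x^{k+1}, Y^{k+1}, S^{k+1})$. Note that the residual term of the third equation is zero. By definition of $(\Delta x, Y^{k+1}, S^{k+1})$, it follows that

$$
\mathcal A(\Delta x) + C + S^{k+1} = 0,\qquad b + H\Delta x + \mathcal A^*(Y^{k+1}) = 0,\qquad Y^{k+1}S^{k+1} = 0. \tag{60}
$$

If the data of (21) is $C^3$-smooth, this implies that

$$
\begin{aligned}
(\mathcal A^+)^*(Y^{k+1}) + b &= \mathcal A^*(Y^{k+1}) + b + \Delta\mathcal A^*(Y^{k+1})\\
&= -H\Delta x + \Delta\mathcal A^*(Y^k) + \Delta\mathcal A^*(\Delta Y)\\
&= -\nabla_x^2\bigl(\mathcal B(x^k)\bullet Y^k\bigr)\Delta x + \bigl(\nabla_x\mathcal B(x^k + \Delta x) - \nabla_x\mathcal B(x^k)\bigr)\bullet Y^k + \Delta\mathcal A^*(\Delta Y)\\
&= O(\|\Delta x\|^2 + \|\Delta Y\|^2),
\end{aligned}
$$

where $\Delta\mathcal A := \mathcal A^+ - \mathcal A$. Likewise, it follows from (60) that

$$
C^+ + S^{k+1} = \Delta C + C + S^{k+1} = \Delta C - \mathcal A(\Delta x)
= \mathcal B(x^{k+1}) - \mathcal B(x^k) - D_x\mathcal B(x^k)[\Delta x] = O(\|\Delta x\|^2),
$$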
where $\Delta C := C^+ - C$. Hence, we can define perturbations $\tilde b$ and $\tilde C^+$ of $b$ and $C^+$ of order $\|\tilde b - b\| + \|\tilde C^+ - C^+\| = O((\|x^k - \bar x\| + \|Y^k - \bar Y\|)^2)$ such that $(\Delta x, Y, S) = (0, Y^{k+1}, S^{k+1})$ is an optimal solution of the problem (56) with data $\mathcal A^+$, $\tilde b$, $\tilde C^+$, $H^+$. By the same derivation as above, the next SSP step has length $O((\|x^k - \bar x\| + \|Y^k - \bar Y\|)^2)$ and generates residuals of order $O((\|x^k - \bar x\| + \|Y^k - \bar Y\|)^4)$. Repeating this process, it then follows by a standard argument that $\|x^{k+1} - \bar x\|$ and $\|Y^{k+1} - \bar Y\|$ are of order $O((\|x^k - \bar x\| + \|Y^k - \bar Y\|)^2)$ as well. □

Remark 5 As mentioned before, one will typically choose to solve SSP subproblems with a 
positive semidefinite approximation H to the Hessian of the Lagrangian. A proof of convergence 
for such modifications is the subject of current research; see, e.g., m- Since all the data enters 
in a continuous fashion in the preceding analysis, it follows that the SSP method with step 
size one is still locally superlinearly convergent if the matrices $H^k$ in the SSP subproblems are replaced by approximations $\hat H^k$ with $\|H^k - \hat H^k\| \to 0$.

Remark 6 The assumption of $C^3$-differentiability of the function $\mathcal B$ in Theorem 3 can be weakened to $C^2$-differentiability and a Lipschitz condition for the Hessian at $\bar x$.


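The result of Theorem 3 can be extended to the following slightly more general class of NLSDPs. Given a vector $b \in \mathbb R^n$, a matrix-valued function $\mathcal B : \mathbb R^n \to \mathcal S^m$, and two vector-valued functions $c : \mathbb R^n \to \mathbb R^p$ and $d : \mathbb R^n \to \mathbb R^q$, we consider problems of the following form:

$$
\begin{array}{ll}
\text{minimize} & b^Tx\\
\text{subject to} & x \in \mathbb R^n,\\
& \mathcal B(x) \preceq 0,\\
& c(x) \le 0,\\
& d(x) = 0.
\end{array}
\tag{61}
$$

The Lagrangian of problem (61) takes the form $\mathcal L : \mathbb R^n \times \mathcal S^m \times \mathbb R^p \times \mathbb R^q \to \mathbb R$:

$$
\mathcal L(x, Y, u, v) := b^Tx + \mathcal B(x)\bullet Y + u^Tc(x) + v^Td(x). \tag{62}
$$

Its gradient with respect to $x$ is given by

$$
g(x, Y, u, v) := \nabla_x\mathcal L(x, Y, u, v) = b + \nabla_x\bigl(\mathcal B(x)\bullet Y\bigr) + \nabla_xc(x)\,u + \nabla_xd(x)\,v \tag{63}
$$

and its Hessian by

$$
H(x, Y, u, v) := \nabla_x^2\mathcal L(x, Y, u, v) = \nabla_x^2\bigl(\mathcal B(x)\bullet Y\bigr) + \sum_{i=1}^{p} u_i\nabla_x^2c_i(x) + \sum_{j=1}^{q} v_j\nabla_x^2d_j(x). \tag{64}
$$

Note that in (63), the gradients of the vector-valued functions $c(x)$ and $d(x)$ are defined as $\nabla_xc(x) := (D_xc(x))^T$ and $\nabla_xd(x) := (D_xd(x))^T$.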

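For NLSDPs (61), the SSP subproblems are of the form

$$
\begin{array}{ll}
\text{minimize} & b^T\Delta x + \tfrac12(\Delta x)^TH^k\Delta x\\
\text{subject to} & \Delta x \in \mathbb R^n,\\
& \mathcal B(x^k) + D_x\mathcal B(x^k)[\Delta x] \preceq 0,\\
& c(x^k) + D_xc(x^k)\,\Delta x \le 0,\\
& d(x^k) + D_xd(x^k)\,\Delta x = 0.
\end{array}
\tag{65}
$$

The extension of Theorem 3 to problems (61) is as follows.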
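
Theorem 4 Assume that the functions $\mathcal B$, $c$, and $d$ in (61) are $C^3$-differentiable, and that problem (61) has a locally unique and strictly complementary solution $(\bar x, \bar Y, \bar u, \bar v)$ that satisfies the Robinson constraint qualification and the second-order sufficient condition with modulus $\mu > 0$. Let some iterate $(x^k, Y^k, u^k, v^k)$ be given and let the next iterate

$$
(x^{k+1}, Y^{k+1}, u^{k+1}, v^{k+1}) := (x^k, Y^k, u^k, v^k) + (\Delta x, \Delta Y, \Delta u, \Delta v)
$$

be defined as the local solution of (65) that is closest to $(x^k, Y^k, u^k, v^k)$. Then there exist $\epsilon > 0$ and $\gamma < 1/\epsilon$ such that

$$
\|(x^{k+1}, Y^{k+1}, u^{k+1}, v^{k+1}) - (\bar x, \bar Y, \bar u, \bar v)\| \le \gamma\,\|(x^k, Y^k, u^k, v^k) - (\bar x, \bar Y, \bar u, \bar v)\|^2
$$

whenever $\|(x^k, Y^k, u^k, v^k) - (\bar x, \bar Y, \bar u, \bar v)\| < \epsilon$.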


Proof. By our assumption on strict complementarity, all entries of the vector $\bar v$ of the Lagrange multipliers associated with the equality constraints $d(x) = 0$ of (61) are different from zero. Without loss of generality, we assume that $\bar v > 0$. Indeed, for any entry $\bar v_j < 0$ we replace the corresponding constraint $d_j(x) = 0$ by the equivalent constraint $-d_j(x) = 0$. These sign changes do not change the iterates generated by (65). Moreover, for $(x^k, Y^k, u^k, v^k)$ sufficiently close to $(\bar x, \bar Y, \bar u, \bar v)$ it follows from $\bar v > 0$ that the iterates do not change when the constraints $d(x) = 0$ are replaced by $d(x) \le 0$. We can thus assume that $q = 0$, i.e., there are no equality constraints in (61).

We further assume that, without loss of generality, the matrix $\mathcal B$ is augmented to a $2 \times 2$ block diagonal matrix, where the (2,2)-block is the diagonal matrix $\mathrm{Diag}(c(x))$. Thus, for the analysis of the SSP method we may assume that $p = q = 0$ in (61), i.e., we only need to consider problems of the form (21). □
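
The block-diagonal augmentation used in the last step of the proof can be written out explicitly. In the sketch below, the symbols $\tilde{\mathcal B}$, $W$, and $U$ are names introduced here for illustration and are not notation from the paper.

```latex
\tilde{\mathcal B}(x) :=
\begin{pmatrix} \mathcal B(x) & 0 \\ 0 & \operatorname{Diag}(c(x)) \end{pmatrix} \preceq 0
\iff
\mathcal B(x) \preceq 0 \ \text{ and } \ c(x) \le 0,
\qquad
\tilde{\mathcal B}(x) \bullet \begin{pmatrix} Y & W \\ W^T & U \end{pmatrix}
= \mathcal B(x) \bullet Y + \sum_{i=1}^{p} U_{ii}\, c_i(x).
```

Only the diagonal of the block $U$ enters the Lagrangian term, and it plays the role of the multiplier vector $u$ of the inequality constraints.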


9 Numerical results 

In this section, we present results of some numerical experiments with a Matlab implementation of the SSP method. Actually, our Matlab program is for a slightly more general class of nonlinear programs with conic constraints (NLCPs). The numerical experiments with our program illustrate the theoretical results of the preceding sections. In particular, quadratic convergence is observed for problems where the Hessian H of the Lagrangian at the optimal solution is positive semidefinite. In cases where H is not positive semidefinite, our implementation uses perturbations of the nonconvex SSP subproblems in order to obtain convex conic subproblems. In these cases, typically, the rate of convergence of the algorithm based on such perturbed problems is only linear.

The Matlab program generates its search directions by solving conic quadratic subproblems 
using Version I.05R5 of SeDuMi m- SeDuMi allows free and positive variables as well as 
Lorentz-cone (“ice-cream cone”) constraints, rotated Lorentz-cone constraints, and semidefinite 
cone constraints. The NLCPs can also be formulated in terms of these cones. In order to 
simplify the use of SeDuMi for the SSP subproblems, the NLCPs are rewritten in the following 
standard format: 

$$
\begin{array}{ll}
\text{minimize} & c^Tx\\
\text{subject to} & x \in \mathcal K,\quad F(x) = 0.
\end{array}
$$

Here, $\mathcal K$ is a Cartesian product of free variables and several cones of the types allowed in SeDuMi.

We tested the following techniques for generating positive semidefinite approximations of H: a BFGS approach, the Hessian of the augmented Lagrangian, and the orthogonal projection of H onto the cone of positive semidefinite matrices. Our experiences with these techniques are as follows.

1. The BFGS approach can result in considerably more SSP iterations compared to the 

projection of the Hessian of the Lagrangian. Moreover, the BFGS approach strongly 
depends on the initial matrix Hq. A good choice is Hq := Hmax(L), where 

H = VDV"^ is the eigenvalue decomposition of H. 

2. The use of the Hessian of the augmented Lagrangian can be a good choice for some
problems, but for most of our test problems the penalty parameter had to be very large
to obtain a semidefinite Hessian. This, in turn, significantly reduced the precision of our
computations.

3. In spite of not being affinely invariant, the use of the projection of the Hessian of the 
Lagrangian resulted in the most efficient overall algorithm. 
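
The orthogonal projection of a symmetric matrix onto the cone of positive semidefinite matrices can be computed from an eigenvalue decomposition by setting the negative eigenvalues to zero. The following is a minimal pure-Python sketch of that standard construction. It uses cyclic Jacobi rotations for the eigenvalue decomposition, which is a choice made here for self-containedness and is not taken from the paper's Matlab implementation; the function names are hypothetical.

```python
import math

def jacobi_eigh(A, tol=1e-24, max_sweeps=100):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.
    Returns (d, V) with A approximately equal to V diag(d) V^T."""
    n = len(A)
    A = [row[:] for row in A]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-300:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # A <- A J (columns p, q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):            # A <- J^T A (rows p, q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
                for k in range(n):            # V <- V J
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p] = c * vkp - s * vkq
                    V[k][q] = s * vkp + c * vkq
    return [A[i][i] for i in range(n)], V

def project_psd(H):
    """Frobenius-norm projection of symmetric H onto the PSD cone."""
    d, V = jacobi_eigh(H)
    n = len(H)
    return [[sum(V[i][k] * max(d[k], 0.0) * V[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

H = [[1.0, 2.0], [2.0, 1.0]]   # eigenvalues 3 and -1
H_plus = project_psd(H)
print(H_plus)
```

In the small example, the eigenvalue -1 is replaced by 0 and the eigenvalue 3 is kept, so the projection is the rank-one matrix with all entries equal to 1.5.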

We also tested different step length strategies. 

1. The following penalty line search with a quadratic correction gave good results for all test
cases. The SSP subproblem provides a search direction $\Delta x$ for the NLCP in the standard
format above. By solving a least-squares problem, a vector $q$ is computed satisfying
$D_xF(x)\,q = -F(x + \Delta x)$. For $\lambda \in [0,1]$, a line search along the points
$x(\lambda) := x + \lambda\Delta x + \lambda^2q$ is performed based on the penalty function

$M\,\|F(x(\lambda))\| + c^Tx(\lambda)$,

where $M > 0$ is a penalty parameter.

2. For some examples, the choice of a filter approach was slightly better. In the filter
approach used here, a Euclidean trust-region radius was always set to be 1.5 times larger
than the previous step, and non-acceptable steps were not discarded, but instead an
Armijo-type step-length reduction was used to generate an acceptable step. The
motivation for this modified filter strategy lies in the fact that the computation of a solution
of a subproblem is very expensive, and therefore discarding the solution of a subproblem
is avoided. The above filter approach led to very fast convergence, especially for convex
problems.

3. For the examples presented here, the trust-region approach was the best choice. The
SSP subproblem was restricted by an additional Euclidean trust-region constraint. For
problems of the form (67) below, it was sufficient to apply the trust-region constraint
only to the variables $X_G$, $X_C$, while $P$, $S$ remain free. For these examples, an additional
corrector step significantly accelerated the convergence. For this corrector step, $X_G$, $X_C$
is kept fixed, and $P$, $S$ are updated by solving an additional linear SDP. At each iteration,
the ratio between predicted and actual reduction was computed. Depending on that ratio,
the step was accepted and the trust region was increased or decreased, or the step was
rejected and the trust region was decreased.
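
The penalty line search with a quadratic correction from the first item can be sketched as follows. The text does not specify the acceptance test or the step-length reduction rule, so the simple-decrease test and the halving of $\lambda$ used below are assumptions, as are the function name and the toy constraint (a unit circle with a hand-computed minimum-norm correction $q$).

```python
import math

def penalty_line_search(F, c, x, dx, q, M, shrink=0.5, max_halvings=30):
    """Backtracking search along x(lam) = x + lam*dx + lam**2*q, lam in [0, 1],
    using the merit function M*||F(x(lam))|| + c^T x(lam).
    The simple-decrease test and the halving rule are assumptions."""
    def point(lam):
        return [xi + lam * di + lam * lam * qi for xi, di, qi in zip(x, dx, q)]

    def merit(z):
        norm_F = math.sqrt(sum(v * v for v in F(z)))
        return M * norm_F + sum(ci * zi for ci, zi in zip(c, z))

    base = merit(x)
    lam = 1.0
    for _ in range(max_halvings):
        z = point(lam)
        if merit(z) < base:
            return lam, z
        lam *= shrink
    return 0.0, list(x)

# Toy data (illustrative): minimize x0 on the unit circle, starting at (0, 1).
F = lambda z: [z[0] ** 2 + z[1] ** 2 - 1.0]
c = [1.0, 0.0]
x, dx = [0.0, 1.0], [-0.5, 0.0]       # dx is tangent to the constraint at x
# Minimum-norm q with D_x F(x) q = -F(x + dx); here D_x F(x) = (0, 2), F(x + dx) = 0.25.
q = [0.0, -0.125]

lam, z = penalty_line_search(F, c, x, dx, q, M=10.0)
lam0, z0 = penalty_line_search(F, c, x, dx, [0.0, 0.0], M=10.0)
print(lam, z, lam0, z0)
```

In this toy example the corrected arc is accepted with the full step, whereas the same search without the correction term has to shorten the step several times before the merit function decreases.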

For our numerical examples, we use nonlinear nonconvex SDPs from the passive reduced-order modeling application, which we rewrite in the form (67) below. Recall that $G$, $C$ and $B_1$, $B_2$ are given data matrices. The nonconvex NLSDP used for the numerical examples is then as follows:

$$
\begin{array}{ll}
\text{minimize} & \|S\|\\
\text{subject to} & \|X_G\| \le r_G,\quad \|X_C\| \le r_C,\\
& P^TB_1 + S = B_2,\\
& P^T(G + X_G) + (G + X_G)^TP \succeq \epsilon_G I,\\
& P^T(C + X_C) + (C + X_C)^TP \succeq \epsilon_C I,\\
& P^T(C + X_C) - (C + X_C)^TP = 0,
\end{array}
\tag{67}
$$

where the unknowns are the matrices $P$, $S$, $X_G$, and $X_C$.

Furthermore, in (67), in addition to the constraints on the norms of the perturbations $X_G$ and $X_C$, we restrict $X_G$ and $X_C$ to have possible nonzero entries only in certain positions, which depend on the nonzero structure of the given matrices $G$ and $C$, respectively. For our numerical tests, the data matrices $G$ and $C$ in (67) are generated as follows. First, two matrices $G_{\mathrm{org}}$ and $C_{\mathrm{org}}$ were constructed such that the associated transfer function is guaranteed to be positive real. Then, certain entries of $G_{\mathrm{org}}$ and $C_{\mathrm{org}}$ were perturbed by random perturbations of norm at most $\varepsilon_G$ and $\varepsilon_C$, respectively, to define the data matrices $G$ and $C$. In all our examples, the transfer functions of the systems given by the resulting matrices $G$ and $C$ were not positive real.

All our computations were run on a Xeon with a clock rate of 2.8 GHz and 3 GB RAM. All solutions were computed to a precision of 12 decimal digits. In the following table, we list the problem dimension $n$, the total number $M(n)$ of equality constraints, the total number $N(n)$ of scalar unknowns, the number of iterations, and the cpu time (in seconds).

  n    M(n)    N(n)   iterations   cpu time
  8     118     285        5           3.71
  9     146     348        5           4.43
 10     177     417        7           8.06
 11     211     492        5           7.18
 12     248     573        8          16.05
 13     288     660        4          10.88
 14     331     753        6          20.77
 15     377     852        7          30.12
 16     426     957        6          34.38
 17     478    1068        5          37.40
 18     533    1185       10          91.17
 19     591    1308        4          47.61
 20     652    1437        5          83.66
 21     716    1572        4         289.48
 22     783    1713        4         416.31
 23     853    1862        5         151.33
 24     926    2013        6         683.54
 25    1002    2176        3         145.60
 26    1081    2337        5         612.22
 27    1163    2508        7         518.92
 28    1248    2685        5         789.41
 29    1336    2868        4         475.52
 30    1427    3057        7        4213.50
 31    1521    3252        4         784.34
 32    1618    3455        6        4659.64
 33    1718    3660        5        1130.44
 34    1821    3877        2         630.53
 35    1927    4092        6        1799.36

Table 1 shows that the number of iterations is nearly independent of the dimension $n$ of the problem, while, as expected, the cpu time increases with $n$. The total number of constraints is approximately $M(n) \approx \tfrac32n^2$ and the total number of scalar variables is approximately $N(n) \approx 3n^2$. The number of iterations to solve the linear semidefinite subproblems not only
depends on the dimension, but also on other properties of the problem as, for example, a 
comparison of the problems of dimension 32 and 33 shows. In this case, the iteration counts 
differ only by one, yet the cpu time quadruples, since SeDuMi needs more iterations to solve 
the subproblems. Some of the linear semidefinite subproblems are nearly infeasible, a situation 
for which SeDuMi (and other solvers) needs a higher number of interior-point iterations. 
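
The quadratic growth of $M(n)$ and $N(n)$ can be checked directly against a few rows of the table. The short sketch below only divides tabulated values by $n^2$; the selection of rows and the function name are arbitrary illustrative choices.

```python
# (n, M(n), N(n)) for a few rows copied from the table above.
rows = [(8, 118, 285), (16, 426, 957), (24, 926, 2013), (35, 1927, 4092)]

def ratios(rows):
    """Return (n, M(n)/n**2, N(n)/n**2) for each row."""
    return [(n, M / n**2, N / n**2) for n, M, N in rows]

result = ratios(rows)
for n, rM, rN in result:
    print(f"n={n:2d}  M/n^2={rM:.3f}  N/n^2={rN:.3f}")
```

Both ratios decrease as $n$ grows and level off, which is the behavior expected of quantities whose leading term is quadratic in $n$.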


10 Concluding remarks 

We have discussed the SSP method, which is a generalization of the SQP method for standard 
nonlinear programs to nonlinear semidefinite programming problems. For the derivation of 
this generalization, we have chosen a motivation that contrasts the SSP method with primal- 
dual interior methods. For interior methods that are applied directly to nonlinear semidefinite
programs, the choice of the symmetrization procedure is considerably more complicated than
in the linear case since the system matrix is no longer positive semidefinite. In the proposed 
method, the choice of the symmetrization scheme is shifted in a very natural way to the 
subproblems, and is thus reduced to a well-studied problem. Our convergence analysis differs 
from the convergence analyses of standard SQP methods in that it is based on a sensitivity 
result for certain optimal solutions of quadratic semidefinite programs. The derivation of this 
sensitivity result is also of independent interest. 